Article

Using Parameter Elimination to Solve Discrete Linear Chebyshev Approximation Problems

Nikolai Krivulin
Faculty of Mathematics and Mechanics, St. Petersburg State University, Universitetskaya Emb. 7/9, 199034 St. Petersburg, Russia
Mathematics 2020, 8(12), 2210; https://doi.org/10.3390/math8122210
Submission received: 18 November 2020 / Revised: 7 December 2020 / Accepted: 9 December 2020 / Published: 13 December 2020
(This article belongs to the Special Issue Approximation Theory and Methods 2020)

Abstract
We consider discrete linear Chebyshev approximation problems in which the unknown parameters of a linear function are fitted by minimizing the least maximum absolute deviation of errors. Such problems find application in the solution of overdetermined systems of linear equations that appear in many practical contexts. The least maximum absolute deviation estimator is used in regression analysis in statistics when the distribution of errors has bounded support. To derive a direct solution of the problem, we propose an algebraic approach based on a parameter elimination technique. As a key component of the approach, an elimination lemma is proved to handle the problem by reducing it to a problem with one parameter eliminated, together with a box constraint imposed on this parameter. We demonstrate the application of the lemma to the direct solution of linear regression problems with one and two parameters. We develop a procedure to solve multidimensional approximation (multiple linear regression) problems in a finite number of steps. The procedure follows a method that comprises two phases: backward elimination and forward substitution of parameters. We describe the main components of the procedure and estimate its computational complexity. We implement symbolic computations in MATLAB to obtain exact solutions for two numerical examples.

1. Introduction

Discrete linear Chebyshev (minimax) approximation problems where the errors of fitting the unknown parameters are measured by the Chebyshev (max, infinity, uniform or $L_\infty$) norm are of theoretical interest and practical importance in many areas of science and engineering. Application of the Chebyshev norm leads to the least maximum absolute deviation of errors as the approximation criterion, and dates back to Laplace’s classical work [1] (book 3, chap. V, §39) (see also [2,3]).
An important area of applications of the discrete linear Chebyshev approximation is the solution of overdetermined systems of linear equations [4,5,6] that appear in many practical contexts. The least maximum absolute deviation estimator is widely used in regression analysis in statistics when the distribution of errors has bounded support. Specifically, the Chebyshev estimator is known to be a maximum likelihood estimator if the error distribution is uniform [5,7,8,9]. Moreover, this estimator can be useful even if errors are not uniform, but controlled in some way, and small, relative to the observed values. Examples of applications include problems in nuclear physics [10,11], parameter estimation of dynamic systems [12,13], statistical machine learning [14,15], and finance [16].
To solve the Chebyshev approximation problem, a number of approaches are known which apply various iterative computational procedures to find numerical solutions (see a comprehensive overview of the algorithmic solutions given by [8,17,18,19]). For instance, the approximation problems under consideration can be reduced to linear programs and then solved numerically by computational algorithms available in linear programming, such as the simplex algorithm and its variations. For linear programming solutions and other related algorithms, one can consult early works [4,5,7,20,21,22,23,24], as well as more recent publications [25,26,27,28].
Along with existing iterative algorithms that find use in applications, direct analytical solutions of the linear Chebyshev approximation problem are also of interest as an essential instrument of formal analysis and treatment of the problem. A useful algebraic approach to derive direct solutions of problems that involve minimizing the Chebyshev distance is proposed in [29,30,31]. The approach offers complete solutions of the problems in the framework of tropical (idempotent) algebra, which deals with algebraic systems with idempotent operations. The solutions obtained in terms of tropical algebra are then represented in the usual form, ready for computation.
In this paper, we reshape and adjust algebraic techniques implemented in the above-mentioned approach to develop a direct solution of the discrete linear Chebyshev approximation problem in terms of conventional algebra. As a key component of the proposed method, an elimination lemma is proved that allows us to handle the problem by reducing it to a problem with one unknown parameter eliminated and a box constraint imposed on this parameter. To provide illuminating but not cumbersome examples of the application of the lemma, we derive direct solutions of problems of low dimension, formulated as linear regression problems with one and two parameters.
Furthermore, we construct a procedure to solve multidimensional approximation (multiple linear regression) problems. The procedure is based on a direct solution method that comprises two phases: backward elimination and forward determination (substitution) of the unknown parameters. The direct solution can supplement and complement the existing iterative procedures and becomes of particular interest when, for one reason or another, the use of iterative algorithms appears to be inappropriate or inadequate. We estimate computational complexity and memory requirements of the procedure, and implement symbolic computations in the MATLAB environment to obtain exact solutions for two illustrative numerical examples.
The rest of the paper is organized as follows. In Section 2, we formulate the approximation problem of interest in both scalar and vector form. Section 3 presents the main result, which offers a reduction step to separate the problem into a problem of lower dimension by eliminating an unknown parameter, and a box constraint for this parameter. We apply the obtained result in Section 4 to derive direct explicit solutions for linear regression problems with one and two unknown parameters. In Section 5, we describe a computational procedure to solve linear approximation problems of arbitrary dimension and discuss its computational complexity. In Section 6, software implementation is discussed and numerical examples are given. Section 7 presents concluding remarks.

2. Linear Chebyshev Approximation Problem

We start with an appropriate notation, preliminary assumptions, and formal representation of the discrete linear Chebyshev approximation problem under study.
Suppose that, given $X_{ij}, Y_i \in \mathbb{R}$ for all $i = 1, \ldots, M$ and $j = 1, \ldots, N$, where M and N are positive integers, we need to find the unknown parameters $\theta_j \in \mathbb{R}$ for all $j = 1, \ldots, N$ that solve the minimax problem
$$\min_{\theta_1, \ldots, \theta_N} \; \max_{1 \le i \le M} \left| \sum_{j=1}^{N} X_{ij} \theta_j - Y_i \right|. \qquad (1)$$
Without loss of generality, we assume that for each $j = 1, \ldots, N$, there exists at least one i, such that $X_{ij} \ne 0$. Otherwise, if $X_{ij} = 0$ for some j and all i, the parameter $\theta_j$ does not affect the objective function, and thus can be removed.
Note that we can represent problem (1) in vector form by introducing the matrix and column vectors
$$\boldsymbol{X} = (X_{ij}), \qquad \boldsymbol{Y} = (Y_i), \qquad \boldsymbol{\theta} = (\theta_j).$$
With this matrix-vector notation and the Chebyshev norm defined for any vector $\boldsymbol{V} = (V_i)$ as
$$\|\boldsymbol{V}\|_\infty = \max_i |V_i|,$$
the approximation problem takes the form
$$\min_{\boldsymbol{\theta}} \; \|\boldsymbol{X}\boldsymbol{\theta} - \boldsymbol{Y}\|_\infty.$$
To solve problem (1), we first show that the problem can be reduced to a problem of the same form, but with one unknown parameter fewer.
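As a quick numeric check of the vector form above, the objective of problem (1) can be evaluated in one line of MATLAB. The sketch below only computes the Chebyshev error of a trial parameter vector and is not part of the solution method; the variable names X, Y, and theta are our assumptions for the data matrix, response vector, and a candidate parameter vector.

```matlab
% Chebyshev approximation error of a candidate theta for data (X, Y):
% the quantity minimized in problem (1), written in the vector form above.
err = norm(X*theta - Y, Inf);   % equals max_i |sum_j X(i,j)*theta(j) - Y(i)|
```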

3. Elimination Lemma

The next result offers a reduction approach to the problem, which provides the basis for the proposed solution.
Lemma 1.
Solving problem (1) is equivalent to solving the problem
$$\min_{\theta_1, \ldots, \theta_{N-1}} \; \max_{\substack{1 \le i < k \le M \\ |X_{iN}| + |X_{kN}| \ne 0}} \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right| \qquad (2)$$
together with the inequality
$$\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\mu}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right) \le \theta_N \le \min_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( \frac{\mu}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right), \qquad (3)$$
where μ is the minimum of the objective function in problem (2), and the empty sums are defined to be zero.
Proof. 
To examine problem (1), we first introduce an auxiliary unknown parameter λ to rewrite the problem as follows:
$$\min_{\theta_1, \ldots, \theta_N} \; \lambda, \qquad \text{s.t.} \quad \max_{1 \le i \le M} \left| \sum_{j=1}^{N} X_{ij} \theta_j - Y_i \right| \le \lambda.$$
Note that the inequality constraint is readily represented as the system of inequalities
$$\lambda \ge -\sum_{j=1}^{N} X_{ij} \theta_j + Y_i, \qquad \lambda \ge \sum_{j=1}^{N} X_{ij} \theta_j - Y_i, \qquad i = 1, \ldots, M;$$
which, in particular, puts the problem into the form of a linear program.
Next, we continue rearrangement by solving for $\theta_N$ those inequalities in which $X_{iN} \ne 0$ to write
$$-\theta_N \ge \frac{\lambda}{X_{iN}} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}}, \quad \theta_N \ge \frac{\lambda}{X_{iN}} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}}, \quad X_{iN} < 0;$$
$$\theta_N \ge -\frac{\lambda}{X_{iN}} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}}, \quad -\theta_N \ge -\frac{\lambda}{X_{iN}} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}}, \quad X_{iN} > 0; \qquad i = 1, \ldots, M.$$
Coupling the inequalities with common left-hand sides and adding the inequalities for X i N = 0 yield
$$-\theta_N \ge -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}}, \quad \theta_N \ge -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}}, \quad X_{iN} \ne 0;$$
$$\lambda \ge \left| \sum_{j=1}^{N-1} X_{ij} \theta_j - Y_i \right|, \quad X_{iN} = 0; \qquad i = 1, \ldots, M.$$
By combining these inequalities for all i = 1 , , M , we obtain
$$-\theta_N \ge \max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}} \right), \quad \theta_N \ge \max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right), \quad \lambda \ge \max_{\substack{1 \le i \le M \\ X_{iN} = 0}} \left| \sum_{j=1}^{N-1} X_{ij} \theta_j - Y_i \right|. \qquad (4)$$
The first two inequalities at (4) result in the double inequality
$$\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right) \le \theta_N \le -\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}} \right).$$
After replacing $\lambda$ by $\mu$, which denotes the minimum of the objective function, and using the equality $-\max(-a, -b) = \min(a, b)$ to change from max to min on the right-hand side, the double inequality takes the form of (3).
The above double inequality defines a nonempty set of values for the unknown θ N if, and only if, the following condition holds:
$$\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right) + \max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}} \right) \le 0,$$
which is readily rearranged in the form of the inequality
$$\max_{\substack{1 \le i, k \le M \\ X_{iN}, X_{kN} \ne 0}} \left( -\frac{|X_{iN}| + |X_{kN}|}{|X_{iN}| |X_{kN}|} \lambda - \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{X_{iN} X_{kN}} \theta_j + \frac{Y_i X_{kN} - Y_k X_{iN}}{X_{iN} X_{kN}} \right) \le 0.$$
This inequality is equivalent to the system of inequalities
$$-\frac{|X_{iN}| + |X_{kN}|}{|X_{iN}| |X_{kN}|} \lambda - \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{X_{iN} X_{kN}} \theta_j + \frac{Y_i X_{kN} - Y_k X_{iN}}{X_{iN} X_{kN}} \le 0, \qquad X_{iN}, X_{kN} \ne 0; \quad 1 \le i, k \le M.$$
By solving these inequalities for λ , we obtain the system
$$\lambda \ge -\frac{|X_{iN}| |X_{kN}|}{X_{iN} X_{kN}} \left( \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right), \qquad X_{iN}, X_{kN} \ne 0; \quad 1 \le i, k \le M.$$
We now note that interchanging the indices i and k in the differences
$$X_{ij} X_{kN} - X_{kj} X_{iN}, \qquad Y_i X_{kN} - Y_k X_{iN}$$
changes the sign of these differences, and hence, the sign of the entire right-hand side of each inequality in the system. As a result, for every pair of indices i and k, the system includes both the inequality
$$\lambda \ge -\frac{|X_{iN}| |X_{kN}|}{X_{iN} X_{kN}} \left( \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right),$$
and the inequality
$$\lambda \ge -\frac{|X_{kN}| |X_{iN}|}{X_{kN} X_{iN}} \left( \sum_{j=1}^{N-1} \frac{X_{kj} X_{iN} - X_{ij} X_{kN}}{|X_{kN}| + |X_{iN}|} \theta_j - \frac{Y_k X_{iN} - Y_i X_{kN}}{|X_{kN}| + |X_{iN}|} \right) = \frac{|X_{iN}| |X_{kN}|}{X_{iN} X_{kN}} \left( \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right).$$
After coupling the paired inequalities and considering that the equality $|X_{iN} X_{kN}| = |X_{iN}| |X_{kN}|$ is valid, we rearrange the system as follows:
$$\lambda \ge \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right|, \qquad X_{iN}, X_{kN} \ne 0; \quad 1 \le i, k \le M.$$
Furthermore, we combine the inequalities for all $1 \le i, k \le M$ and add the last inequality at (4) to replace the condition $X_{iN}, X_{kN} \ne 0$ by that in the form $|X_{iN}| + |X_{kN}| \ne 0$ and rewrite the system as one inequality
$$\lambda \ge \max_{\substack{1 \le i, k \le M \\ |X_{iN}| + |X_{kN}| \ne 0}} \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right|.$$
We now observe that the term under the max operator is invariant under permutation of the indices i and k, and is equal to zero if $i = k$. Therefore, we can reduce the set of indices defined by the condition $1 \le i, k \le M$ to that given by the condition $1 \le i < k \le M$ and represent the lower bound on $\lambda$ as
$$\lambda \ge \max_{\substack{1 \le i < k \le M \\ |X_{iN}| + |X_{kN}| \ne 0}} \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right|.$$
Since the minimum of $\lambda$ is bounded from below by the expression on the right-hand side, we need to find the minimum of this expression with respect to $\theta_1, \ldots, \theta_{N-1}$, which leads to solving problem (2). □
To conclude this section, we note that the reduced problem at (2) has the same general form as problem (1) with the parameter θ N eliminated. This offers a potential for solving the problem under study by recurrent implementation of Lemma 1. We discuss application of the lemma to derive direct solutions of problems of low dimension and to develop a recursive procedure to solve problems of arbitrary dimension in what follows.

4. Solution of One- and Two-Parameter Regression Problems

We now apply the obtained result to derive direct, exact solutions to regression problems with one and two parameters. These solutions can be directly extended to problems with more parameters, which leads, however, to more complicated and bulky expressions, not presented here to save space.
We start with one-parameter simple linear regression problems, which have well-known solutions, and then find a complete solution for a two-parameter linear regression problem.

4.1. One-Parameter Linear Regression Problems

Let us suppose that, for given explanatory (independent) variables $X_i \in \mathbb{R}$ and response (dependent) variables $Y_i \in \mathbb{R}$, where $i = 1, \ldots, M$, we find the unknown regression parameter $\theta \in \mathbb{R}$ that achieves the minimum
$$\min_{\theta} \; \max_{1 \le i \le M} |X_i \theta - Y_i|. \qquad (5)$$
To solve the problem, we directly apply Lemma 1 with $N = 1$. Elimination of the empty sums in (2) and (3), and substitution $X_{i1} = X_i$ for all $i = 1, \ldots, M$ and $\theta_1 = \theta$ yield the next results.
Proposition 1.
The minimum error in problem (5) is equal to
$$\mu = \max_{\substack{1 \le i < k \le M \\ |X_i| + |X_k| \ne 0}} \frac{|Y_i X_k - Y_k X_i|}{|X_i| + |X_k|},$$
and all solutions of the problem are given by the condition
$$\max_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( -\frac{\mu}{|X_i|} + \frac{Y_i}{X_i} \right) \le \theta \le \min_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( \frac{\mu}{|X_i|} + \frac{Y_i}{X_i} \right).$$
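For illustration, the formulas of Proposition 1 translate directly into a few lines of MATLAB. The sketch below is an illustrative rendering under our own assumptions (the function name cheb_fit_1p and the plain numeric, non-symbolic arithmetic are ours, not the author's published code); it returns the minimum error mu and the endpoints of the interval of optimal parameters.

```matlab
function [mu, lo, hi] = cheb_fit_1p(X, Y)
% One-parameter Chebyshev fit min_theta max_i |X(i)*theta - Y(i)|,
% evaluated directly from the formulas of Proposition 1.
% X and Y are numeric column vectors of equal length M.
    M = numel(X);
    mu = 0;
    for i = 1:M-1
        for k = i+1:M
            d = abs(X(i)) + abs(X(k));
            if d ~= 0
                mu = max(mu, abs(Y(i)*X(k) - Y(k)*X(i)) / d);
            end
        end
    end
    % Box constraint for theta: every theta in [lo, hi] attains the error mu.
    idx = X ~= 0;
    lo = max(-mu ./ abs(X(idx)) + Y(idx) ./ X(idx));
    hi = min( mu ./ abs(X(idx)) + Y(idx) ./ X(idx));
end
```

For example, cheb_fit_1p([1; 1; 1], [0; 1; 4]) gives mu = 2 and the single admissible value theta = 2, in agreement with the midrange formula derived next.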
We now consider a special case of problem (5) in the form
$$\min_{\theta} \; \max_{1 \le i \le M} |\theta - Y_i|. \qquad (6)$$
To handle the problem, we set $X_i = 1$ for all $i = 1, \ldots, M$ in the expressions obtained in Proposition 1. Since $|X_i| + |X_k| = 2 \ne 0$, the minimum error in problem (6) becomes
$$\mu = \max_{1 \le i < k \le M} |Y_i - Y_k| / 2 = \max_{1 \le i, k \le M} |Y_i - Y_k| / 2 = \max_{1 \le i \le M} Y_i / 2 - \min_{1 \le i \le M} Y_i / 2.$$
The solution θ is given by the condition
$$-\mu + \max_{1 \le i \le M} Y_i \le \theta \le \mu + \min_{1 \le i \le M} Y_i,$$
which, after substitution of the above expression for μ , leads to the unique result
$$\theta = \max_{1 \le i \le M} Y_i / 2 + \min_{1 \le i \le M} Y_i / 2.$$
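In MATLAB terms, this special case is just the midrange of the responses; a minimal sketch, assuming Y is a numeric vector of the observations:

```matlab
theta = (max(Y) + min(Y))/2;   % unique Chebyshev estimate of a constant
mu    = (max(Y) - min(Y))/2;   % least maximum absolute deviation
```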

4.2. Two-Parameter Linear Regression Problem

We now turn to two-parameter problems, which can be solved by twofold application of Lemma 1. To avoid cumbersome calculations, we concentrate on a special case in which, given variables $X_i, Y_i \in \mathbb{R}$ for all $i = 1, \ldots, M$, our aim is to find the parameters $\theta_1, \theta_2 \in \mathbb{R}$ to achieve
$$\min_{\theta_1, \theta_2} \; \max_{1 \le i \le M} |\theta_1 + X_i \theta_2 - Y_i|. \qquad (7)$$
Proposition 2.
The minimum error in problem (7) is equal to
$$\mu = \max_{\substack{1 \le i < k \le M \\ |X_i| + |X_k| \ne 0}} \; \max_{\substack{1 \le p < r \le M \\ X_r \ne X_p}} \frac{|(Y_i X_k - Y_k X_i)(X_r - X_p) - (Y_p X_r - Y_r X_p)(X_k - X_i)|}{(|X_i| + |X_k|) |X_r - X_p| + (|X_p| + |X_r|) |X_k - X_i|}, \qquad (8)$$
and all solutions of the problem are given by the conditions
$$\max_{\substack{1 \le i < k \le M \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \mu + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right) \le \theta_1 \le \min_{\substack{1 \le i < k \le M \\ X_k \ne X_i}} \left( \frac{|X_i| + |X_k|}{|X_k - X_i|} \mu + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right), \qquad (9)$$
$$\max_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( -\frac{\mu}{|X_i|} - \frac{\theta_1}{X_i} + \frac{Y_i}{X_i} \right) \le \theta_2 \le \min_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( \frac{\mu}{|X_i|} - \frac{\theta_1}{X_i} + \frac{Y_i}{X_i} \right). \qquad (10)$$
Proof. 
We apply Lemma 1 with $N = 2$, where we take $X_{i1} = 1$ and $X_{i2} = X_i$ for all $i = 1, \ldots, M$. As a result, problem (7) reduces to the one-parameter problem
$$\min_{\theta_1} \; \max_{\substack{1 \le i < k \le M \\ |X_i| + |X_k| \ne 0}} \left| \frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 - \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|} \right|$$
and box constraint for θ 2 in the form of the double inequality at (10), where μ is the minimum in the one-parameter problem.
We note that the objective function in the problem does not change if we replace the condition $1 \le i < k \le M$ for indices over which the maximum is taken, by the extended condition $1 \le i, k \le M$.
In a similar way as in Lemma 1, we first represent the one-parameter problem under consideration as
$$\min_{\theta_1} \; \lambda, \qquad \text{s.t.} \quad \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0}} \left| \frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 - \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|} \right| \le \lambda.$$
The inequality constraint in this problem is equivalent to the system of inequalities
$$\lambda \ge -\frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 + \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|}, \quad \lambda \ge \frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 - \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|}, \qquad |X_i| + |X_k| \ne 0; \quad 1 \le i, k \le M.$$
After solving the inequalities for θ 1 , we rewrite the system as
$$-\theta_1 \ge -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda - \frac{Y_i X_k - Y_k X_i}{X_k - X_i}, \quad \theta_1 \ge -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i}, \quad X_k \ne X_i;$$
$$\lambda \ge \frac{|Y_i X_k - Y_k X_i|}{|X_i| + |X_k|}, \quad X_k = X_i; \qquad |X_i| + |X_k| \ne 0; \quad 1 \le i, k \le M.$$
By combining these inequalities, we obtain
$$-\theta_1 \ge \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda - \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right), \quad \theta_1 \ge \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right), \quad \lambda \ge \max_{\substack{1 \le i, k \le M \\ X_k = X_i \ne 0}} \frac{|Y_i X_k - Y_k X_i|}{|X_i| + |X_k|} = \max_{\substack{1 \le i, k \le M \\ X_k = X_i \ne 0}} \frac{|Y_i - Y_k|}{2}. \qquad (11)$$
The first two inequalities yield the double inequality
$$\max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right) \le \theta_1 \le \min_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( \frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right).$$
Since, under the condition $X_k \ne X_i$, the condition $|X_i| + |X_k| \ne 0$ holds as well, we exclude the latter one. Observing that the terms under the max and min operators are invariant under permutation of i and k, we adjust the condition on indices to write the box constraint for $\theta_1$ as (9).
The feasibility condition for the box constraint to be valid for θ 1 takes the form of the inequality
$$\max_{\substack{1 \le i, k \le M \\ X_k \ne X_i}} \; \max_{\substack{1 \le p, r \le M \\ X_r \ne X_p}} \left( -\left( \frac{|X_i| + |X_k|}{|X_k - X_i|} + \frac{|X_p| + |X_r|}{|X_r - X_p|} \right) \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} - \frac{Y_p X_r - Y_r X_p}{X_r - X_p} \right) \le 0.$$
As before, we represent this inequality as the system of inequalities for each i , k and p , r , and then solve these inequalities for λ . After combining the solutions back into one inequality and adding the last inequality at (11), we obtain
$$\lambda \ge \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0}} \; \max_{\substack{1 \le p, r \le M \\ X_r \ne X_p}} \frac{|(Y_i X_k - Y_k X_i)(X_r - X_p) - (Y_p X_r - Y_r X_p)(X_k - X_i)|}{(|X_i| + |X_k|) |X_r - X_p| + (|X_p| + |X_r|) |X_k - X_i|},$$
where the expression on the right-hand side determines the minimum μ .
Finally, we note that the fractional term under maximization is invariant with respect to interchanging the indices i and k, as well as p and r. Therefore, we can replace the conditions $1 \le i, k \le M$ and $1 \le p, r \le M$ by the conditions $1 \le i < k \le M$ and $1 \le p < r \le M$, which yields the representation for $\mu$ in the form of (8). □
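To make the two-parameter case concrete, the following MATLAB sketch evaluates formulas (8)–(10) by direct enumeration of index pairs. It is an illustrative rendering under our own assumptions (numeric rather than symbolic data, at least one nonzero X(i), not all X(i) equal, and the function name cheb_fit_2p), not the author's published code; it returns the minimum error and one representative solution.

```matlab
function [mu, t1, t2] = cheb_fit_2p(X, Y)
% Two-parameter Chebyshev fit min max_i |theta1 + X(i)*theta2 - Y(i)|,
% evaluated from formulas (8)-(10). X, Y are numeric column vectors.
    M = numel(X);
    mu = 0;
    for i = 1:M-1
        for k = i+1:M
            dik = abs(X(i)) + abs(X(k));
            if dik == 0, continue; end
            for p = 1:M-1
                for r = p+1:M
                    if X(r) == X(p), continue; end
                    num = abs((Y(i)*X(k) - Y(k)*X(i))*(X(r) - X(p)) ...
                            - (Y(p)*X(r) - Y(r)*X(p))*(X(k) - X(i)));
                    den = dik*abs(X(r) - X(p)) ...
                        + (abs(X(p)) + abs(X(r)))*abs(X(k) - X(i));
                    mu = max(mu, num/den);
                end
            end
        end
    end
    % Box constraint (9) for theta1; pick its lower end as a representative.
    lo1 = -Inf; hi1 = Inf;
    for i = 1:M-1
        for k = i+1:M
            if X(k) == X(i), continue; end
            c = (Y(i)*X(k) - Y(k)*X(i))/(X(k) - X(i));
            w = (abs(X(i)) + abs(X(k)))/abs(X(k) - X(i));
            lo1 = max(lo1, -w*mu + c);
            hi1 = min(hi1,  w*mu + c);
        end
    end
    t1 = lo1;
    % Box constraint (10) for theta2, with theta1 fixed at t1.
    idx = X ~= 0;
    lo2 = max(-mu./abs(X(idx)) - t1./X(idx) + Y(idx)./X(idx));
    hi2 = min( mu./abs(X(idx)) - t1./X(idx) + Y(idx)./X(idx));
    t2 = (lo2 + hi2)/2;
end
```

The four nested loops over index pairs already hint at the growth in cost that is analyzed for the general procedure in Section 5.4.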

5. General Solution Procedure

We now use Lemma 1 to derive a complete solution of problem (1) by performing a series of reduction steps, each eliminating an unknown parameter in the problem and determining a box constraint for this parameter. We observe that the elimination of a parameter from the objective function as described in Lemma 1 leaves the general form of the problem unchanged. Therefore, we can repeat the elimination over and over again until the function has no more parameters, and thus becomes a constant that shows the minimum of the objective function in the initial problem.
At the same time, together with the elimination of a parameter from the objective function, Lemma 1 offers a box constraint for this parameter, represented in terms of those parameters which are retained in the function. We see that the last constraint does not depend on any other parameters, and thus is directly given by a double inequality with constant values on both sides. As a result, we can take the box constraints in the order from the last constraint to the first, which yields a system of double inequalities that completely determines the solution set of the problem.
We are in a position to describe the solution procedure formally in more detail. The procedure follows a direct solution method that examines the unknown parameters in reverse order, starting from the parameter $\theta_N$ and going backward to the parameter $\theta_1$. Let n be the number of parameters in the objective function in the current step of the procedure.
Initially, we take $n = N$ and set $M_n = M$. For all $i = 1, \ldots, M_n$ and $j = 1, \ldots, n$, we define
$$X_{ij}^{(n)} = X_{ij}, \qquad Y_i^{(n)} = Y_i.$$
We also introduce the matrix-vector notation
$$\boldsymbol{X}_n = \begin{pmatrix} X_{11}^{(n)} & \cdots & X_{1n}^{(n)} \\ \vdots & \ddots & \vdots \\ X_{M_n,1}^{(n)} & \cdots & X_{M_n,n}^{(n)} \end{pmatrix}, \qquad \boldsymbol{Y}_n = \begin{pmatrix} Y_1^{(n)} \\ \vdots \\ Y_{M_n}^{(n)} \end{pmatrix}, \qquad \boldsymbol{\theta}_n = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_n \end{pmatrix}.$$
For each $n = N, N-1, \ldots, 1$, the procedure produces a two-fold outcome: the reduction of the current problem by eliminating an unknown parameter, and the derivation of a box constraint for the eliminated parameter.

5.1. Elimination of Parameters

Assuming that the norm sign in what follows stands for the Chebyshev norm, we start with eliminating the parameter θ n from the problem
$$\min_{\theta_1, \ldots, \theta_n} \; \max_{1 \le i \le M_n} \left| \sum_{j=1}^{n} X_{ij}^{(n)} \theta_j - Y_i^{(n)} \right| = \min_{\boldsymbol{\theta}_n} \; \|\boldsymbol{X}_n \boldsymbol{\theta}_n - \boldsymbol{Y}_n\|.$$
It follows from Lemma 1 that, as a result of this elimination, the problem reduces, if n > 1 , to the problem
$$\min_{\theta_1, \ldots, \theta_{n-1}} \; \max_{1 \le i \le M_{n-1}} \left| \sum_{j=1}^{n-1} X_{ij}^{(n-1)} \theta_j - Y_i^{(n-1)} \right| = \min_{\boldsymbol{\theta}_{n-1}} \; \|\boldsymbol{X}_{n-1} \boldsymbol{\theta}_{n-1} - \boldsymbol{Y}_{n-1}\|,$$
or degenerates, if n = 1 , to the constant
$$\max_{1 \le i \le M_0} |Y_i^{(0)}| = \|\boldsymbol{Y}_0\|.$$
We now exploit the representation of problem (2) to establish formulas of recalculating the objective function when changing to the reduced problem.
First, we consider the condition for indices in (2), which takes the form $1 \le i < k \le M_n$. We assume that the pairs of indices $(i, k)$ defined by the condition are listed in the order of the sequence
$$(1, 2), \ldots, (1, M_n), \; (2, 3), \ldots, (2, M_n), \; \ldots, \; (M_n - 1, M_n).$$
It is not difficult to verify by direct substitution that each fixed pair ( i , k ) in this sequence has the number (index) calculated as
$$M_n (i - 1) - i(i - 1)/2 + k - i.$$
Furthermore, we use (2) to define, for all i and k such that $1 \le i < k \le M_n$ and for all $j = 1, \ldots, n-1$, the recurrent formulas
$$X_{M_n(i-1) - i(i-1)/2 + k - i,\; j}^{(n-1)} = \begin{cases} \dfrac{X_{ij}^{(n)} X_{kn}^{(n)} - X_{kj}^{(n)} X_{in}^{(n)}}{|X_{in}^{(n)}| + |X_{kn}^{(n)}|}, & \text{if } |X_{in}^{(n)}| + |X_{kn}^{(n)}| \ne 0; \\ 0, & \text{otherwise}; \end{cases}$$
$$Y_{M_n(i-1) - i(i-1)/2 + k - i}^{(n-1)} = \begin{cases} \dfrac{Y_i^{(n)} X_{kn}^{(n)} - Y_k^{(n)} X_{in}^{(n)}}{|X_{in}^{(n)}| + |X_{kn}^{(n)}|}, & \text{if } |X_{in}^{(n)}| + |X_{kn}^{(n)}| \ne 0; \\ 0, & \text{otherwise}. \end{cases} \qquad (12)$$
Note that if $|X_{in}^{(n)}| + |X_{kn}^{(n)}| = 0$, then the above formulas produce a row of zeros that corresponds to a zero term, which does not contribute to the objective function. We assume that all such zero rows are removed, and the rest of the rows are renumbered (reindexed) to preserve continual enumeration.
We denote the number of nonzero rows by $M_{n-1}$ and observe that
$$M_{n-1} \le M_n (M_n - 1)/2.$$
Finally, we take the numbers $X_{ij}^{(n-1)}$ and $Y_i^{(n-1)}$ with $i = 1, \ldots, M_{n-1}$ and $j = 1, \ldots, n-1$ to form the matrix and vector
$$\boldsymbol{X}_{n-1} = (X_{ij}^{(n-1)}), \qquad \boldsymbol{Y}_{n-1} = (Y_i^{(n-1)}),$$
which completely determine the objective function in the reduced problem. Specifically, the reduced problem for n = 1 degenerates into the constant
$$\mu = \|\boldsymbol{Y}_0\|, \qquad (13)$$
representing the minimum of the objective function in the initial problem.
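A compact way to read Formula (12) is as a routine that maps $(\boldsymbol{X}_n, \boldsymbol{Y}_n)$ to $(\boldsymbol{X}_{n-1}, \boldsymbol{Y}_{n-1})$. The MATLAB sketch below is a plain numeric rendering under our own assumptions (the function name eliminate_last and double rather than symbolic arithmetic); it enumerates the pairs $i < k$ in the order described above and drops zero rows as the text prescribes.

```matlab
function [Xr, Yr] = eliminate_last(Xn, Yn)
% One backward-elimination step: build the reduced data (X_{n-1}, Y_{n-1})
% from (X_n, Y_n) by Formula (12). Xn is M_n-by-n, Yn is M_n-by-1, n >= 1.
    [Mn, n] = size(Xn);
    Xr = []; Yr = [];
    for i = 1:Mn-1
        for k = i+1:Mn
            d = abs(Xn(i,n)) + abs(Xn(k,n));
            if d == 0, continue; end        % zero row: does not contribute
            row = (Xn(i,1:n-1)*Xn(k,n) - Xn(k,1:n-1)*Xn(i,n)) / d;
            rhs = (Yn(i)*Xn(k,n) - Yn(k)*Xn(i,n)) / d;
            Xr = [Xr; row]; %#ok<AGROW>
            Yr = [Yr; rhs]; %#ok<AGROW>
        end
    end
end
```

When n = 1 the returned matrix is empty and the vector Yr plays the role of $\boldsymbol{Y}_0$, whose largest absolute entry is the minimum value (13).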

5.2. Derivation of Box Constraints

We take the double inequality at (3) and adjust it to write the box constraint for the parameter θ n in the form
$$\max_{\substack{1 \le i \le M_n \\ X_{in}^{(n)} \ne 0}} \left( -\frac{\mu}{|X_{in}^{(n)}|} - \sum_{j=1}^{n-1} \frac{X_{ij}^{(n)}}{X_{in}^{(n)}} \theta_j + \frac{Y_i^{(n)}}{X_{in}^{(n)}} \right) \le \theta_n \le \min_{\substack{1 \le i \le M_n \\ X_{in}^{(n)} \ne 0}} \left( \frac{\mu}{|X_{in}^{(n)}|} - \sum_{j=1}^{n-1} \frac{X_{ij}^{(n)}}{X_{in}^{(n)}} \theta_j + \frac{Y_i^{(n)}}{X_{in}^{(n)}} \right),$$
where μ denotes the minimum value of the objective function in problem (1).
To represent this inequality constraint in vector form, we introduce the following notation. First, for all $i = 1, \ldots, M_n$ such that $X_{in}^{(n)} \ne 0$ and all $j = 1, \ldots, n-1$, we define
$$T_{ij}^{(n)} = \frac{X_{ij}^{(n)}}{X_{in}^{(n)}}, \qquad L_i^{(n)} = \frac{Y_i^{(n)}}{X_{in}^{(n)}} - \frac{\mu}{|X_{in}^{(n)}|}, \qquad U_i^{(n)} = \frac{Y_i^{(n)}}{X_{in}^{(n)}} + \frac{\mu}{|X_{in}^{(n)}|}. \qquad (14)$$
We note that all indices i with $X_{in}^{(n)} = 0$ are not taken into account when calculating the maximum and minimum in the double inequality, and hence are excluded from the index set $1, \ldots, M_n$. Assuming that the rest of indices are renumbered to preserve continual enumeration, we introduce the matrix and column vectors
$$\boldsymbol{T}_n = (T_{ij}^{(n)}), \qquad \boldsymbol{L}_n = (L_i^{(n)}), \qquad \boldsymbol{U}_n = (U_i^{(n)}).$$
With this matrix-vector notation, we write the box constraint, if n > 1 , as the double inequality
$$\max(\boldsymbol{L}_n - \boldsymbol{T}_n \boldsymbol{\theta}_{n-1}) \le \theta_n \le \min(\boldsymbol{U}_n - \boldsymbol{T}_n \boldsymbol{\theta}_{n-1}), \qquad (15)$$
and, if n = 1 , as the inequality
$$\max(\boldsymbol{L}_1) \le \theta_1 \le \min(\boldsymbol{U}_1), \qquad (16)$$
where max and min symbols are thought of as operators that calculate the maximum and minimum over all elements of corresponding vectors.
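The vector form of the box constraints is equally short in MATLAB. The sketch below computes the arrays of (14) for a given level n (the function name box_parts and the numeric arithmetic are our assumptions, not the author's code). With its output, constraint (15) reads max(Ln - Tn*th) <= theta_n <= min(Un - Tn*th), where th collects the already fixed values of theta_1, ..., theta_{n-1}; constraint (16) is recovered automatically when n = 1, because Tn is then empty and the product Tn*th vanishes.

```matlab
function [Tn, Ln, Un] = box_parts(Xn, Yn, mu)
% Ingredients of the box constraints (15)/(16) for theta_n, built from
% (X_n, Y_n) and the optimal value mu by Formula (14).
% Rows with a zero entry in the last column are dropped, as in the text.
    n   = size(Xn, 2);
    idx = Xn(:, n) ~= 0;                 % keep rows with X_{in}^{(n)} ~= 0
    Xs  = Xn(idx, :);  Ys = Yn(idx);
    Tn  = Xs(:, 1:n-1) ./ Xs(:, n);      % T_{ij}^{(n)} = X_{ij}^{(n)}/X_{in}^{(n)}
    Ln  = Ys ./ Xs(:, n) - mu ./ abs(Xs(:, n));
    Un  = Ys ./ Xs(:, n) + mu ./ abs(Xs(:, n));
end
```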

5.3. Solution Algorithm

We summarize the above consideration in the form of a computational algorithm to solve problem (1) in a finite number of steps. The algorithm includes two sequential phases: backward elimination and forward determination (substitution) of the unknown parameters.
The backward elimination starts with $n = N$ by setting $M_n = M$ and
$$\boldsymbol{X}_n = \boldsymbol{X}, \qquad \boldsymbol{Y}_n = \boldsymbol{Y}.$$
Furthermore, for each $n = N, \ldots, 1$, the matrix $\boldsymbol{X}_n$ and vector $\boldsymbol{Y}_n$ are used as described above to obtain the values of $\boldsymbol{X}_{n-1}$ and $\boldsymbol{Y}_{n-1}$ if $n > 1$ or the value of $\boldsymbol{Y}_0$ if $n = 1$. As supplementary results, the matrices $\boldsymbol{T}_n$ are also evaluated from the matrices $\boldsymbol{X}_n$.
The backward elimination completes at $n = 1$ by calculating the minimum value of the objective function, given by $\mu = \|\boldsymbol{Y}_0\|$.
The forward determination first uses the obtained minimum $\mu$, matrix $\boldsymbol{X}_1$ and vector $\boldsymbol{Y}_1$ to calculate the vectors $\boldsymbol{L}_1$ and $\boldsymbol{U}_1$, and then evaluates the box constraint for the unknown $\theta_1$ in the form of the double inequality at (16). Then, for each $n = 2, \ldots, N$, the vectors $\boldsymbol{L}_n$ and $\boldsymbol{U}_n$ are calculated from $\mu$, $\boldsymbol{X}_n$ and $\boldsymbol{Y}_n$ to represent the box constraints for $\theta_n$ as in (15).
Note that the bounds in the box constraint for the parameter $\theta_1$ are explicitly defined by constants, whereas the bounds for each parameter $\theta_n$ with $n > 1$ are defined as linear functions of the previous parameters $\theta_1, \ldots, \theta_{n-1}$. Therefore, we can first fix a value to satisfy the box constraint for $\theta_1$ and then substitute this value into the box constraint for $\theta_2$ to obtain explicit bounds given by constants. By repeating such calculations to fix a value for a parameter with explicit bounds and to evaluate bounds for the next parameter, we can successively determine a solution of the problem.
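Putting the two phases together gives a short driver. The sketch below reuses the hypothetical helpers eliminate_last and box_parts from the previous sketches and always picks the midpoint of each box, so it returns one representative solution rather than the full solution set; it is a numeric illustration of the method, not the author's symbolic implementation, and is practical only for small M and N for the reasons discussed in the next subsection.

```matlab
function [theta, mu] = cheb_fit(X, Y)
% Direct solution of problem (1) by backward elimination and forward
% substitution, following the two-phase procedure of Section 5.
    N = size(X, 2);
    Xs = cell(N, 1); Ys = cell(N, 1);
    Xs{N} = X; Ys{N} = Y;
    for n = N:-1:2                        % backward elimination
        [Xs{n-1}, Ys{n-1}] = eliminate_last(Xs{n}, Ys{n});
    end
    [~, Y0] = eliminate_last(Xs{1}, Ys{1});
    mu = max(abs(Y0));                    % minimum of the objective, mu = ||Y_0||
    theta = zeros(N, 1);                  % forward determination of parameters
    for n = 1:N
        [Tn, Ln, Un] = box_parts(Xs{n}, Ys{n}, mu);
        lo = max(Ln - Tn*theta(1:n-1));
        hi = min(Un - Tn*theta(1:n-1));
        theta(n) = (lo + hi)/2;           % any value in [lo, hi] is admissible
    end
end
```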

5.4. Computational Complexity

The most computationally intensive and memory demanding component of the algorithm, which determines the overall rate of computational complexity and memory requirement, is the calculation of entries in the matrices $\boldsymbol{X}_n$ and vectors $\boldsymbol{Y}_n$ for all $n = N-1, N-2, \ldots, 1$, and the vector $\boldsymbol{Y}_0$, by using (12). Though calculating one entry involves a few simple operations, the number of all entries grows very fast as M and N increase.
To derive a rough estimate for the number of entries in all matrices, we first evaluate the number of rows in each matrix. Assuming that the matrix $\boldsymbol{X}_N = \boldsymbol{X}$ has M rows, we see that the number of rows in $\boldsymbol{X}_{N-1}$ is bounded from above by
$$M_{N-1} \le M_N (M_N - 1)/2 = M(M - 1)/2 < M^2/2 = 2(M/2)^2.$$
Recursive application of this estimate yields an upper bound for the number of rows in the matrices $\boldsymbol{X}_{N-l}$ for each $l = 1, \ldots, N-1$ in the form
$$M_{N-l} \le M_{N-l+1} (M_{N-l+1} - 1)/2 < M_{N-l+1}^2/2 < 2(M/2)^{2^l}.$$
At the last step with $l = N$, we calculate the vector $\boldsymbol{Y}_0$, in which the number of entries is no more than
$$M_0 \le M_1 (M_1 - 1)/2 < M_1^2/2 < 2(M/2)^{2^N}.$$
Since we have $n + 1$ columns in the matrix $\boldsymbol{X}_n$ together with the vector $\boldsymbol{Y}_n$, the overall number of the entries in all steps can be estimated as
$$\sum_{l=1}^{N} (N - l + 1) M_{N-l} < \sum_{l=1}^{N} 2(N - l + 1)(M/2)^{2^l}.$$
We denote the upper bound on the right-hand side by C(N, M) and observe that this bound increases polynomially with respect to M, and double exponentially with N. For problems with few parameters, the value of C(N, M) seems to be rather acceptable. Specifically, in the three-parameter case with N = 3 and M = 10, 20, 50, we have C(3, 10) = 783,900, C(3, 20) = 200,040,600, and C(3, 50) = 305,177,347,500. A further increase of the number of parameters N results in a rapid rise in the value of C(N, M), as the following examples show: C(4, 10) = 305,177,347,700, C(5, 10) = 610,353,911,500.
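The bound is easy to tabulate; a one-line MATLAB transcription of the sum defining C(N, M) is given below, for cross-checking only.

```matlab
% Upper bound C(N, M) on the total number of entries, as defined above.
C = @(N, M) sum(2*(N - (1:N) + 1) .* (M/2).^(2.^(1:N)));
C(3, 10)   % returns 783900, the value quoted for the three-parameter case
```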
Note that the actual number of entries to calculate is fewer than that given by the bound C(N, M). As it follows from (12), this number can be further reduced if the matrices $\boldsymbol{X}_n$, or at least the initial matrix $\boldsymbol{X}_N = \boldsymbol{X}$, have many zero entries (sparse matrices). It is clear that, if a matrix has zero entries in a column other than the last one, the columns (and related parameters) can be renumbered to put the column with zero entries on the last place where zero entries can reduce computations. As an example of the case with good chances of having sparse matrices $\boldsymbol{X}_n$, one can consider problems where the initial matrix $\boldsymbol{X}$ has entries that take values from the set $\{-1, 0, 1\}$.
Finally, we observe that the calculations by Formula (12) can be performed for different entries quite independently, which offers strong potential for parallel implementation of the procedure on parallel and vector computers to provide more computational and memory resources, and hence to extend applicability to problems of higher dimensions.

6. Software Implementation and Numerical Examples

We conclude with comments on software implementation of the solution procedure and illustrative numerical examples of low dimensions that demonstrate the computational technique involved in the solution. For the sake of illustration, we concentrate on application to problems with exact input data given by integer (rational) numbers to find explicit rational solutions.
To obtain exact solutions, the procedure has been coded for serial symbolic computations in the MATLAB (Release R2020b) environment as a collection of functions that calculate all intermediate matrices and vectors, as well as provide the overall functionality of the algorithm. The numerical experiments were conducted on a custom computer with a 4-core 8-thread Intel Xeon E3-1231 v3 CPU at 3.40GHz and 32GB of DDR3 RAM, running Windows 10 Enterprise 64-bit OS.
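The exactness comes from keeping every quantity symbolic: if the input data are wrapped in sym, all the ratios produced by (12) and (14) remain rational numbers instead of floating-point approximations. A minimal sketch with made-up data (not the data of the examples below):

```matlab
A = sym([2 1; 1 3; 4 1]);   % exact integer data
b = sym([1; 2; 0]);
r = (b(1)*A(2,2) - b(2)*A(1,2)) / (abs(A(1,2)) + abs(A(2,2)));   % exact result 1/4
```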
Example 1.
Let us take N = 3 and M = 4 and consider the approximation (regression) problem with
X = 3 1 2 1 2 2 2 3 1 0 2 1 , Y = 2 1 1 0 .
We start with the backward elimination phase by setting X 3 = X and Y 3 = Y . We use (12) to calculate the entries in
X 2 = 2 1 / 2 1 / 3 5 / 3 1 1 5 / 3 4 / 3 1 / 3 2 / 3 1 1 / 2 , Y 2 = 1 / 2 0 2 / 3 1 / 3 1 / 3 1 / 2 ,
and then the entries in
X 1 = 21 / 13 1 21 / 11 9 / 7 3 / 2 3 / 4 7 / 9 1 / 7 9 / 13 9 / 7 3 / 5 1 1 / 3 3 / 11 3 / 7 , Y 1 = 5 / 13 1 / 9 5 / 11 1 / 7 1 / 2 5 / 12 5 / 27 5 / 21 5 / 13 11 / 21 1 / 15 5 / 9 1 / 3 3 / 11 3 / 7 .
Next, we obtain the vector Y 0 , which appears to have 105 entries and thus is not shown here to save space. Evaluating the maximum entry of Y 0 as the minimum of the objective function according to (13) yields
μ = 2 / 7 .
In parallel with evaluating the entries of X n and Y n , we apply (14) to find the entries in
T 3 = 3 / 2 1 / 2 1 / 2 1 2 3 0 2 , T 2 = 4 1 / 5 1 5 / 4 1 / 2 2 .
Substitution of the obtained minimum μ = 2 / 7 yields
L 3 = 6 / 7 5 / 14 5 / 7 2 / 7 , U 3 = 8 / 7 9 / 14 9 / 7 2 / 7 , L 2 = 3 / 7 6 / 35 8 / 21 13 / 28 1 / 14 11 / 7 , U 2 = 11 / 7 6 / 35 20 / 21 1 / 28 13 / 14 3 / 7 .
Finally, we calculate the vectors
L 1 = 3 / 49 11 / 63 13 / 147 1 / 9 1 / 7 11 / 63 19 / 147 11 / 3 1 / 7 5 / 27 23 / 63 17 / 63 1 / 7 1 / 21 1 / 3 , U 1 = 61 / 147 25 / 63 19 / 49 1 / 3 11 / 21 59 / 63 89 / 147 1 / 3 61 / 63 17 / 27 37 / 63 53 / 63 13 / 7 43 / 21 5 / 3 .
The forward determination (substitution) phase of the procedure involves application of (16) and then (15) to obtain the unique solution
θ 1 = 1 / 3 , θ 2 = 5 / 21 , θ 3 = 16 / 21 .
The actual number of entries in the matrices X n and vectors Y n to calculate in the problem is 153, which is fewer than the upper bound given by C ( 3 , 4 ) = 600 . The computer time to obtain the exact solution by symbolic computation averages 2.3 s.
Example 2.
Consider the problem with N = 3 and M = 10 , where the input data are given by
X = 3 1 2 1 2 2 2 3 1 0 2 1 1 2 1 3 1 0 1 1 1 1 1 2 0 3 1 2 1 0 , Y = 2 1 1 0 1 1 2 0 1 3 .
The application of the procedure yields the unique solution
θ 1 = 1 / 9 , θ 2 = 13 / 18 , θ 3 = 8 / 9 .
The matrices $\boldsymbol{X}_n$ and vectors $\boldsymbol{Y}_n$ involved in calculations have 444,280 entries (while the corresponding upper bound is C(3, 10) = 783,900). The symbolic computations take about 52 min of computer time.

7. Conclusions

A direct computational technique has been proposed to solve discrete linear Chebyshev approximation problems, which find wide application in various areas, including the least maximum absolute deviation regression in statistics. First, we have shown that the problem under consideration can be reduced, by eliminating an unknown parameter, to a problem with a smaller number of unknowns, together with a box constraint for the eliminated parameter. This result was used to obtain direct solutions to linear regression problems with one and two parameters.
To solve approximation problems of arbitrary dimension, we have developed a direct solution procedure that consists of two parts: backward elimination and forward substitution of the unknown parameters. The direct solution is of particular interest in problems where analytical solutions are desired, whereas the use of iterative algorithms may be inappropriate or inadequate. We have estimated the computational complexity of the procedure, discussed its MATLAB implementation intended to provide exact solutions by symbolic computations, and presented numerical examples.
Possible lines of further investigation can include modification and improvement of the algorithm to reduce computational complexity in solving problems in both exact (rational) and inexact (floating point) form. The development of parallel implementations of the algorithm to speed up calculations is also of interest.

Funding

This work was supported in part by the Russian Foundation for Basic Research grant number 20-010-00145.

Acknowledgments

The author is very grateful to three referees for their valuable comments and suggestions, which have been incorporated into the revised version of the manuscript.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. de Laplace, P.S. Mécanique Céleste; Hillard, Gray, Little, and Wilkins: Boston, MA, USA, 1832; Volume 2. (Engl. transl. with comment. by N. Bowditch).
  2. Harter, H.L. The method of least squares and some alternatives: Part I. Int. Stat. Rev. 1974, 42, 147–174.
  3. Steffens, K.G. The History of Approximation Theory; Birkhäuser: Boston, MA, USA, 2006.
  4. Tewarson, R.P. On minimax solutions of linear equations. Comput. J. 1972, 15, 277–279.
  5. Appa, G.; Smith, C. On L1 and Chebyshev estimation. Math. Program. 1973, 5, 73–87.
  6. Pınar, M.Ç. Overdetermined systems of linear equations. In Encyclopedia of Optimization, 2nd ed.; Floudas, C.A., Pardalos, P.M., Eds.; Springer: Boston, MA, USA, 2009; pp. 2893–2896.
  7. Rabinowitz, P. Applications of linear programming to numerical analysis. SIAM Rev. 1968, 10, 121–159.
  8. Hand, M.L. Aspects of Linear Regression Estimation Under the Criterion of Minimizing the Maximum Absolute Residual. Ph.D. Thesis, Iowa State University, Ames, IA, USA, 1978.
  9. Kennedy, W.J.; Gentle, J.E. Statistical Computing; Statistics: Textbooks and Monographs; Dekker: New York, NY, USA, 1980; Volume 33.
  10. James, F. Fitting tracks in wire chambers using the Chebyshev norm instead of least squares. Nucl. Instr. Meth. 1983, 211, 145–152.
  11. Bertsch, G.F.; Sabbey, B.; Uusnäkki, M. Fitting theories of nuclear binding energies. Phys. Rev. C 2005, 71, 054311.
  12. Mäkilä, P.M. Robust identification and Galois sequences. Internat. J. Control 1991, 54, 1189–1200.
  13. Akçay, H.; Hjalmarsson, H.; Ljung, L. On the choice of norms in system identification. IEEE Trans. Automat. Control 1996, 41, 1367–1372.
  14. Schölkopf, B.; Smola, A.J. Learning with Kernels; Adaptive Computation and Machine Learning; MIT Press: Cambridge, MA, USA, 2002; p. 626.
  15. Bartoszuk, M.; Beliakov, G.; Gagolewski, M.; James, S. Fitting aggregation functions to data: Part I—Linearization and regularization. In Information Processing and Management of Uncertainty in Knowledge-Based Systems; Carvalho, J.P., Lesot, M.J., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R.R., Eds.; Springer: Cham, Switzerland, 2016; Volume 611, pp. 767–779.
  16. Jaschke, S.R. Arbitrage bounds for the term structure of interest rates. Finance Stoch. 1997, 2, 29–40.
  17. Harter, H.L. The method of least squares and some alternatives: Part III. Int. Stat. Rev. 1975, 43, 1–44.
  18. Harter, H.L. The method of least squares and some alternatives: Part IV. Int. Stat. Rev. 1975, 43, 125–190.
  19. Späth, H. Mathematical Algorithms for Linear Regression; Computer Science and Scientific Computing; Academic Press, Inc.: San Diego, CA, USA, 1992; p. 338.
  20. Wagner, H.M. Linear programming techniques for regression analysis. J. Am. Statist. Assoc. 1959, 54, 206–212.
  21. Stiefel, E. Note on Jordan elimination, linear programming and Tchebycheff approximation. Numer. Math. 1960, 2, 1–17.
  22. Watson, G.A. On the best linear one-sided Chebyshev approximation. J. Approx. Theory 1973, 7, 48–58.
  23. Sposito, V.A. Minimizing the maximum absolute deviation. ACM SIGMAP Bull. 1976, 51–53.
  24. Armstrong, R.D.; Kung, D.S. Algorithm AS 135: Min-max estimates for a linear multiple regression problem. J. R. Stat. Soc. Ser. C. Appl. Stat. 1979, 28, 93–100.
  25. Kim, B.Y. Algorithm for the constrained Chebyshev estimation in linear regression. Commun. Stat. Appl. Methods 2000, 7, 47–54.
  26. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
  27. Castillo, E.; Mínguez, R.; Castillo, C.; Cofiño, A.S. Dealing with the multiplicity of solutions of the ℓ1 and ℓ∞ regression models. Eur. J. Oper. Res. 2008, 188, 460–484.
  28. Ene, A.; Vladu, A. Improved convergence for ℓ1 and ℓ∞ regression via iteratively reweighted least squares. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 10–15 June 2019; Chaudhuri, K., Salakhutdinov, R., Eds.; Volume 97, pp. 1794–1801.
  29. Krivulin, N. Complete solution of a constrained tropical optimization problem with application to location analysis. In Relational and Algebraic Methods in Computer Science; Lecture Notes in Computer Science; Höfner, P., Jipsen, P., Kahl, W., Müller, M.E., Eds.; Springer: Cham, Switzerland, 2014; Volume 8428, pp. 362–378.
  30. Krivulin, N. Algebraic solution of weighted minimax single-facility constrained location problems. In Relational and Algebraic Methods in Computer Science; Lecture Notes in Computer Science; Desharnais, J., Guttmann, W., Joosten, S., Eds.; Springer: Cham, Switzerland, 2018; Volume 11194, pp. 317–332.
  31. Krivulin, N. Algebraic solution of minimax single-facility constrained location problems with Chebyshev and rectilinear distances. J. Log. Algebr. Methods Program. 2020, 115, 100578.