Article

Using Parameter Elimination to Solve Discrete Linear Chebyshev Approximation Problems

Nikolai Krivulin
Faculty of Mathematics and Mechanics, St. Petersburg State University, Universitetskaya Emb. 7/9, 199034 St. Petersburg, Russia
Mathematics 2020, 8(12), 2210; https://doi.org/10.3390/math8122210
Submission received: 18 November 2020 / Revised: 7 December 2020 / Accepted: 9 December 2020 / Published: 13 December 2020
(This article belongs to the Special Issue Approximation Theory and Methods 2020)

Abstract
We consider discrete linear Chebyshev approximation problems in which the unknown parameters of a linear function are fitted by minimizing the least maximum absolute deviation of errors. Such problems find application in the solution of overdetermined systems of linear equations that appear in many practical contexts. The least maximum absolute deviation estimator is used in regression analysis in statistics when the distribution of errors has bounded support. To derive a direct solution of the problem, we propose an algebraic approach based on a parameter elimination technique. As a key component of the approach, an elimination lemma is proved to handle the problem by reducing it to a problem with one parameter eliminated, together with a box constraint imposed on this parameter. We demonstrate the application of the lemma to the direct solution of linear regression problems with one and two parameters. We develop a procedure to solve multidimensional approximation (multiple linear regression) problems in a finite number of steps. The procedure follows a method that comprises two phases: backward elimination and forward substitution of parameters. We describe the main components of the procedure and estimate its computational complexity. We implement symbolic computations in MATLAB to obtain exact solutions for two numerical examples.

1. Introduction

Discrete linear Chebyshev (minimax) approximation problems where the errors of fitting the unknown parameters are measured by the Chebyshev (max, infinity, uniform or $L_\infty$) norm are of theoretical interest and practical importance in many areas of science and engineering. Application of the Chebyshev norm leads to the least maximum absolute deviation of errors as the approximation criterion, and dates back to Laplace’s classical work [1] (book 3, chap. V, §39) (see also [2,3]).
An important area of applications of the discrete linear Chebyshev approximation is the solution of overdetermined systems of linear equations [4,5,6] that appear in many practical contexts. The least maximum absolute deviation estimator is widely used in regression analysis in statistics when the distribution of errors has bounded support. Specifically, the Chebyshev estimator is known to be a maximum likelihood estimator if the error distribution is uniform [5,7,8,9]. Moreover, this estimator can be useful even if errors are not uniform, but controlled in some way, and small, relative to the observed values. Examples of applications include problems in nuclear physics [10,11], parameter estimation of dynamic systems [12,13], statistical machine learning [14,15], and finance [16].
To solve the Chebyshev approximation problem, a number of approaches are known which apply various iterative computational procedures to find numerical solutions (see a comprehensive overview of the algorithmic solutions given by [8,17,18,19]). For instance, the approximation problems under consideration can be reduced to linear programs and then solved numerically by computational algorithms available in linear programming, such as the simplex algorithm and its variations. For linear programming solutions and other related algorithms, one can consult early works [4,5,7,20,21,22,23,24], as well as more recent publications [25,26,27,28].
Along with existing iterative algorithms that find use in applications, direct analytical solutions of the linear Chebyshev approximation problem are also of interest as an essential instrument of formal analysis and treatment of the problem. A useful algebraic approach to derive direct solutions of problems that involve minimizing the Chebyshev distance is proposed in [29,30,31]. The approach offers complete solutions of the problems in the framework of tropical (idempotent) algebra, which deals with algebraic systems with idempotent operations. The solutions obtained in terms of tropical algebra are then represented in the usual form, ready for computation.
In this paper, we reshape and adjust algebraic techniques implemented in the above-mentioned approach to develop a direct solution of the discrete linear Chebyshev approximation problem in terms of conventional algebra. As a key component of the proposed method, an elimination lemma is proved that allows us to handle the problem by reducing it to a problem with one unknown parameter eliminated and a box constraint imposed on this parameter. To provide illuminating but not cumbersome examples of the application of the lemma, we derive direct solutions of problems of low dimension, formulated as linear regression problems with one and two parameters.
Furthermore, we construct a procedure to solve multidimensional approximation (multiple linear regression) problems. The procedure is based on a direct solution method that comprises two phases: backward elimination and forward determination (substitution) of the unknown parameters. The direct solution can supplement and complement the existing iterative procedures and becomes of particular interest when, for one reason or another, the use of iterative algorithms appears to be inappropriate or inadequate. We estimate computational complexity and memory requirements of the procedure, and implement symbolic computations in the MATLAB environment to obtain exact solutions for two illustrative numerical examples.
The rest of the paper is organized as follows. In Section 2, we formulate the approximation problem of interest in both scalar and vector form. Section 3 presents the main result, which offers a reduction step to separate the problem into a problem of lower dimension by eliminating an unknown parameter, and a box constraint for this parameter. We apply the obtained result in Section 4 to derive direct explicit solutions for linear regression problems with one and two unknown parameters. In Section 5, we describe a computational procedure to solve linear approximation problems of arbitrary dimension and discuss its computational complexity. In Section 6, software implementation is discussed and numerical examples are given. Section 7 presents concluding remarks.

2. Linear Chebyshev Approximation Problem

We start with an appropriate notation, preliminary assumptions, and formal representation of the discrete linear Chebyshev approximation problem under study.
Suppose that, given $X_{ij}, Y_i \in \mathbb{R}$ for all $i = 1, \ldots, M$ and $j = 1, \ldots, N$, where M and N are positive integers, we need to find the unknown parameters $\theta_j \in \mathbb{R}$ for all $j = 1, \ldots, N$ that solve the minimax problem
$$\min_{\theta_1, \ldots, \theta_N} \; \max_{1 \le i \le M} \left| \sum_{j=1}^{N} X_{ij} \theta_j - Y_i \right|. \qquad (1)$$
Without loss of generality, we assume that for each $j = 1, \ldots, N$, there exists at least one i, such that $X_{ij} \ne 0$. Otherwise, if $X_{ij} = 0$ for some j and all i, the parameter $\theta_j$ does not affect the objective function, and thus can be removed.
Note that we can represent problem (1) in vector form by introducing the matrix and column vectors
$$\boldsymbol{X} = (X_{ij}), \qquad \boldsymbol{Y} = (Y_i), \qquad \boldsymbol{\theta} = (\theta_j).$$
With this matrix-vector notation and the Chebyshev norm defined for any vector $\boldsymbol{V} = (V_i)$ as
$$\|\boldsymbol{V}\|_\infty = \max_i |V_i|,$$
the approximation problem takes the form
$$\min_{\boldsymbol{\theta}} \; \|\boldsymbol{X}\boldsymbol{\theta} - \boldsymbol{Y}\|_\infty.$$
To solve problem (1), we first show that the problem can be reduced to a problem of the same form, but with one unknown parameter fewer.
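As a quick numeric check of the vector form above, the objective of problem (1) can be evaluated in one line of MATLAB. The sketch below only computes the Chebyshev error of a trial parameter vector and is not part of the solution method; the variable names X, Y, and theta are our assumptions for the data matrix, response vector, and a candidate parameter vector.

```matlab
% Chebyshev approximation error of a candidate theta for data (X, Y):
% the quantity minimized in problem (1), written in the vector form above.
err = norm(X*theta - Y, Inf);   % equals max_i |sum_j X(i,j)*theta(j) - Y(i)|
```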

3. Elimination Lemma

The next result offers a reduction approach to the problem, which provides the basis for the proposed solution.
Lemma 1.
Solving problem (1) is equivalent to solving the problem
$$\min_{\theta_1, \ldots, \theta_{N-1}} \; \max_{\substack{1 \le i < k \le M \\ |X_{iN}| + |X_{kN}| \ne 0}} \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right| \qquad (2)$$
together with the inequality
$$\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\mu}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right) \le \theta_N \le \min_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( \frac{\mu}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right), \qquad (3)$$
where μ is the minimum of the objective function in problem (2), and the empty sums are defined to be zero.
Proof. 
To examine problem (1), we first introduce an auxiliary unknown parameter λ to rewrite the problem as follows:
$$\min_{\theta_1, \ldots, \theta_N} \; \lambda, \qquad \text{s.t.} \quad \max_{1 \le i \le M} \left| \sum_{j=1}^{N} X_{ij} \theta_j - Y_i \right| \le \lambda.$$
Note that the inequality constraint is readily represented as the system of inequalities
$$\lambda \ge -\sum_{j=1}^{N} X_{ij} \theta_j + Y_i, \qquad \lambda \ge \sum_{j=1}^{N} X_{ij} \theta_j - Y_i, \qquad i = 1, \ldots, M;$$
which, in particular, puts the problem into the form of a linear program.
Next, we continue rearrangement by solving for $\theta_N$ those inequalities in which $X_{iN} \ne 0$ to write
$$-\theta_N \ge \frac{\lambda}{X_{iN}} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}}, \quad \theta_N \ge \frac{\lambda}{X_{iN}} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}}, \quad X_{iN} < 0;$$
$$\theta_N \ge -\frac{\lambda}{X_{iN}} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}}, \quad -\theta_N \ge -\frac{\lambda}{X_{iN}} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}}, \quad X_{iN} > 0; \qquad i = 1, \ldots, M.$$
Coupling the inequalities with common left-hand sides and adding the inequalities for X i N = 0 yield
$$-\theta_N \ge -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}}, \quad \theta_N \ge -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}}, \quad X_{iN} \ne 0;$$
$$\lambda \ge \left| \sum_{j=1}^{N-1} X_{ij} \theta_j - Y_i \right|, \quad X_{iN} = 0; \qquad i = 1, \ldots, M.$$
By combining these inequalities for all i = 1 , , M , we obtain
$$-\theta_N \ge \max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}} \right), \quad \theta_N \ge \max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right), \quad \lambda \ge \max_{\substack{1 \le i \le M \\ X_{iN} = 0}} \left| \sum_{j=1}^{N-1} X_{ij} \theta_j - Y_i \right|. \qquad (4)$$
The first two inequalities at (4) result in the double inequality
$$\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right) \le \theta_N \le -\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}} \right).$$
After replacing $\lambda$ by $\mu$, which denotes the minimum of the objective function, and using the equality $-\max(-a, -b) = \min(a, b)$ to change from max to min on the right-hand side, the double inequality takes the form of (3).
The above double inequality defines a nonempty set of values for the unknown θ N if, and only if, the following condition holds:
$$\max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} - \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j + \frac{Y_i}{X_{iN}} \right) + \max_{\substack{1 \le i \le M \\ X_{iN} \ne 0}} \left( -\frac{\lambda}{|X_{iN}|} + \sum_{j=1}^{N-1} \frac{X_{ij}}{X_{iN}} \theta_j - \frac{Y_i}{X_{iN}} \right) \le 0,$$
which is readily rearranged in the form of the inequality
$$\max_{\substack{1 \le i, k \le M \\ X_{iN}, X_{kN} \ne 0}} \left( -\frac{|X_{iN}| + |X_{kN}|}{|X_{iN}| |X_{kN}|} \lambda - \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{X_{iN} X_{kN}} \theta_j + \frac{Y_i X_{kN} - Y_k X_{iN}}{X_{iN} X_{kN}} \right) \le 0.$$
This inequality is equivalent to the system of inequalities
$$-\frac{|X_{iN}| + |X_{kN}|}{|X_{iN}| |X_{kN}|} \lambda - \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{X_{iN} X_{kN}} \theta_j + \frac{Y_i X_{kN} - Y_k X_{iN}}{X_{iN} X_{kN}} \le 0, \qquad X_{iN}, X_{kN} \ne 0; \quad 1 \le i, k \le M.$$
By solving these inequalities for λ , we obtain the system
$$\lambda \ge -\frac{|X_{iN}| |X_{kN}|}{X_{iN} X_{kN}} \left( \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right), \qquad X_{iN}, X_{kN} \ne 0; \quad 1 \le i, k \le M.$$
We now note that interchanging the indices i and k in the differences
$$X_{ij} X_{kN} - X_{kj} X_{iN}, \qquad Y_i X_{kN} - Y_k X_{iN}$$
changes the sign of these differences, and hence, the sign of the entire right-hand side of each inequality in the system. As a result, for every pair of indices i and k, the system includes both the inequality
$$\lambda \ge -\frac{|X_{iN}| |X_{kN}|}{X_{iN} X_{kN}} \left( \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right),$$
and the inequality
$$\lambda \ge -\frac{|X_{kN}| |X_{iN}|}{X_{kN} X_{iN}} \left( \sum_{j=1}^{N-1} \frac{X_{kj} X_{iN} - X_{ij} X_{kN}}{|X_{kN}| + |X_{iN}|} \theta_j - \frac{Y_k X_{iN} - Y_i X_{kN}}{|X_{kN}| + |X_{iN}|} \right) = \frac{|X_{iN}| |X_{kN}|}{X_{iN} X_{kN}} \left( \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right).$$
After coupling the paired inequalities and considering that the equality $|X_{iN} X_{kN}| = |X_{iN}| |X_{kN}|$ is valid, we rearrange the system as follows:
$$\lambda \ge \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right|, \qquad X_{iN}, X_{kN} \ne 0; \quad 1 \le i, k \le M.$$
Furthermore, we combine the inequalities for all $1 \le i, k \le M$ and add the last inequality at (4) to replace the condition $X_{iN}, X_{kN} \ne 0$ by that in the form $|X_{iN}| + |X_{kN}| \ne 0$ and rewrite the system as one inequality
$$\lambda \ge \max_{\substack{1 \le i, k \le M \\ |X_{iN}| + |X_{kN}| \ne 0}} \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right|.$$
We now observe that the term under the max operator is invariant under permutation of the indices i and k, and is equal to zero if $i = k$. Therefore, we can reduce the set of indices defined by the condition $1 \le i, k \le M$ to that given by the condition $1 \le i < k \le M$ and represent the lower bound on $\lambda$ as
$$\lambda \ge \max_{\substack{1 \le i < k \le M \\ |X_{iN}| + |X_{kN}| \ne 0}} \left| \sum_{j=1}^{N-1} \frac{X_{ij} X_{kN} - X_{kj} X_{iN}}{|X_{iN}| + |X_{kN}|} \theta_j - \frac{Y_i X_{kN} - Y_k X_{iN}}{|X_{iN}| + |X_{kN}|} \right|.$$
Since the minimum of $\lambda$ is bounded from below by the expression on the right-hand side, we need to find the minimum of this expression with respect to $\theta_1, \ldots, \theta_{N-1}$, which leads to solving problem (2). □
To conclude this section, we note that the reduced problem at (2) has the same general form as problem (1) with the parameter θ N eliminated. This offers a potential for solving the problem under study by recurrent implementation of Lemma 1. We discuss application of the lemma to derive direct solutions of problems of low dimension and to develop a recursive procedure to solve problems of arbitrary dimension in what follows.

4. Solution of One- and Two-Parameter Regression Problems

We now apply the obtained result to derive direct, exact solutions to regression problems with one and two parameters. These solutions can be directly extended to problems with more parameters, which leads, however, to more complicated and bulky expressions, not presented here to save space.
We start with one-parameter simple linear regression problems, which have well-known solutions, and then find a complete solution for a two-parameter linear regression problem.

4.1. One-Parameter Linear Regression Problems

Let us suppose that, for given explanatory (independent) variables $X_i \in \mathbb{R}$ and response (dependent) variables $Y_i \in \mathbb{R}$, where $i = 1, \ldots, M$, we find the unknown regression parameter $\theta \in \mathbb{R}$ that achieves the minimum
$$\min_{\theta} \; \max_{1 \le i \le M} |X_i \theta - Y_i|. \qquad (5)$$
To solve the problem, we directly apply Lemma 1 with $N = 1$. Elimination of the empty sums in (2) and (3), and substitution $X_{i1} = X_i$ for all $i = 1, \ldots, M$ and $\theta_1 = \theta$ yield the next results.
Proposition 1.
The minimum error in problem (5) is equal to
$$\mu = \max_{\substack{1 \le i < k \le M \\ |X_i| + |X_k| \ne 0}} \frac{|Y_i X_k - Y_k X_i|}{|X_i| + |X_k|},$$
and all solutions of the problem are given by the condition
$$\max_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( -\frac{\mu}{|X_i|} + \frac{Y_i}{X_i} \right) \le \theta \le \min_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( \frac{\mu}{|X_i|} + \frac{Y_i}{X_i} \right).$$
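For illustration, the formulas of Proposition 1 translate directly into a few lines of MATLAB. The sketch below is an illustrative rendering under our own assumptions (the function name cheb_fit_1p and the plain numeric, non-symbolic arithmetic are ours, not the author's published code); it returns the minimum error mu and the endpoints of the interval of optimal parameters.

```matlab
function [mu, lo, hi] = cheb_fit_1p(X, Y)
% One-parameter Chebyshev fit min_theta max_i |X(i)*theta - Y(i)|,
% evaluated directly from the formulas of Proposition 1.
% X and Y are numeric column vectors of equal length M.
    M = numel(X);
    mu = 0;
    for i = 1:M-1
        for k = i+1:M
            d = abs(X(i)) + abs(X(k));
            if d ~= 0
                mu = max(mu, abs(Y(i)*X(k) - Y(k)*X(i)) / d);
            end
        end
    end
    % Box constraint for theta: every theta in [lo, hi] attains the error mu.
    idx = X ~= 0;
    lo = max(-mu ./ abs(X(idx)) + Y(idx) ./ X(idx));
    hi = min( mu ./ abs(X(idx)) + Y(idx) ./ X(idx));
end
```

For example, cheb_fit_1p([1; 1; 1], [0; 1; 4]) gives mu = 2 and the single admissible value theta = 2, in agreement with the midrange formula derived next.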
We now consider a special case of problem (5) in the form
$$\min_{\theta} \; \max_{1 \le i \le M} |\theta - Y_i|. \qquad (6)$$
To handle the problem, we set $X_i = 1$ for all $i = 1, \ldots, M$ in the expressions obtained in Proposition 1. Since $|X_i| + |X_k| = 2 \ne 0$, the minimum error in problem (6) becomes
$$\mu = \max_{1 \le i < k \le M} |Y_i - Y_k| / 2 = \max_{1 \le i, k \le M} |Y_i - Y_k| / 2 = \max_{1 \le i \le M} Y_i / 2 - \min_{1 \le i \le M} Y_i / 2.$$
The solution θ is given by the condition
$$-\mu + \max_{1 \le i \le M} Y_i \le \theta \le \mu + \min_{1 \le i \le M} Y_i,$$
which, after substitution of the above expression for μ , leads to the unique result
$$\theta = \max_{1 \le i \le M} Y_i / 2 + \min_{1 \le i \le M} Y_i / 2.$$
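In MATLAB terms, this special case is just the midrange of the responses; a minimal sketch, assuming Y is a numeric vector of the observations:

```matlab
theta = (max(Y) + min(Y))/2;   % unique Chebyshev estimate of a constant
mu    = (max(Y) - min(Y))/2;   % least maximum absolute deviation
```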

4.2. Two-Parameter Linear Regression Problem

We now turn to two-parameter problems, which can be solved by twofold application of Lemma 1. To avoid cumbersome calculations, we concentrate on a special case in which, given variables $X_i, Y_i \in \mathbb{R}$ for all $i = 1, \ldots, M$, our aim is to find the parameters $\theta_1, \theta_2 \in \mathbb{R}$ to achieve
$$\min_{\theta_1, \theta_2} \; \max_{1 \le i \le M} |\theta_1 + X_i \theta_2 - Y_i|. \qquad (7)$$
Proposition 2.
The minimum error in problem (7) is equal to
$$\mu = \max_{\substack{1 \le i < k \le M \\ |X_i| + |X_k| \ne 0}} \; \max_{\substack{1 \le p < r \le M \\ X_r \ne X_p}} \frac{|(Y_i X_k - Y_k X_i)(X_r - X_p) - (Y_p X_r - Y_r X_p)(X_k - X_i)|}{(|X_i| + |X_k|) |X_r - X_p| + (|X_p| + |X_r|) |X_k - X_i|}, \qquad (8)$$
and all solutions of the problem are given by the conditions
$$\max_{\substack{1 \le i < k \le M \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \mu + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right) \le \theta_1 \le \min_{\substack{1 \le i < k \le M \\ X_k \ne X_i}} \left( \frac{|X_i| + |X_k|}{|X_k - X_i|} \mu + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right), \qquad (9)$$
$$\max_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( -\frac{\mu}{|X_i|} - \frac{\theta_1}{X_i} + \frac{Y_i}{X_i} \right) \le \theta_2 \le \min_{\substack{1 \le i \le M \\ X_i \ne 0}} \left( \frac{\mu}{|X_i|} - \frac{\theta_1}{X_i} + \frac{Y_i}{X_i} \right). \qquad (10)$$
Proof. 
We apply Lemma 1 with $N = 2$, where we take $X_{i1} = 1$ and $X_{i2} = X_i$ for all $i = 1, \ldots, M$. As a result, problem (7) reduces to the one-parameter problem
$$\min_{\theta_1} \; \max_{\substack{1 \le i < k \le M \\ |X_i| + |X_k| \ne 0}} \left| \frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 - \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|} \right|$$
and box constraint for θ 2 in the form of the double inequality at (10), where μ is the minimum in the one-parameter problem.
We note that the objective function in the problem does not change if we replace the condition $1 \le i < k \le M$ for indices over which the maximum is taken, by the extended condition $1 \le i, k \le M$.
In a similar way as in Lemma 1, we first represent the one-parameter problem under consideration as
$$\min_{\theta_1} \; \lambda, \qquad \text{s.t.} \quad \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0}} \left| \frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 - \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|} \right| \le \lambda.$$
The inequality constraint in this problem is equivalent to the system of inequalities
$$\lambda \ge -\frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 + \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|}, \quad \lambda \ge \frac{X_k - X_i}{|X_i| + |X_k|} \theta_1 - \frac{Y_i X_k - Y_k X_i}{|X_i| + |X_k|}, \qquad |X_i| + |X_k| \ne 0; \quad 1 \le i, k \le M.$$
After solving the inequalities for θ 1 , we rewrite the system as
$$-\theta_1 \ge -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda - \frac{Y_i X_k - Y_k X_i}{X_k - X_i}, \quad \theta_1 \ge -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i}, \quad X_k \ne X_i;$$
$$\lambda \ge \frac{|Y_i X_k - Y_k X_i|}{|X_i| + |X_k|}, \quad X_k = X_i; \qquad |X_i| + |X_k| \ne 0; \quad 1 \le i, k \le M.$$
By combining these inequalities, we obtain
$$-\theta_1 \ge \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda - \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right), \quad \theta_1 \ge \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right), \quad \lambda \ge \max_{\substack{1 \le i, k \le M \\ X_k = X_i \ne 0}} \frac{|Y_i X_k - Y_k X_i|}{|X_i| + |X_k|} = \max_{\substack{1 \le i, k \le M \\ X_k = X_i \ne 0}} \frac{|Y_i - Y_k|}{2}. \qquad (11)$$
The first two inequalities yield the double inequality
$$\max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( -\frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right) \le \theta_1 \le \min_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0 \\ X_k \ne X_i}} \left( \frac{|X_i| + |X_k|}{|X_k - X_i|} \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} \right).$$
Since, under the condition $X_k \ne X_i$, the condition $|X_i| + |X_k| \ne 0$ holds as well, we exclude the latter one. Observing that the terms under the max and min operators are invariant under permutation of i and k, we adjust the condition on indices to write the box constraint for $\theta_1$ as (9).
The feasibility condition for the box constraint to be valid for θ 1 takes the form of the inequality
$$\max_{\substack{1 \le i, k \le M \\ X_k \ne X_i}} \; \max_{\substack{1 \le p, r \le M \\ X_r \ne X_p}} \left( -\left( \frac{|X_i| + |X_k|}{|X_k - X_i|} + \frac{|X_p| + |X_r|}{|X_r - X_p|} \right) \lambda + \frac{Y_i X_k - Y_k X_i}{X_k - X_i} - \frac{Y_p X_r - Y_r X_p}{X_r - X_p} \right) \le 0.$$
As before, we represent this inequality as the system of inequalities for each i , k and p , r , and then solve these inequalities for λ . After combining the solutions back into one inequality and adding the last inequality at (11), we obtain
$$\lambda \ge \max_{\substack{1 \le i, k \le M \\ |X_i| + |X_k| \ne 0}} \; \max_{\substack{1 \le p, r \le M \\ X_r \ne X_p}} \frac{|(Y_i X_k - Y_k X_i)(X_r - X_p) - (Y_p X_r - Y_r X_p)(X_k - X_i)|}{(|X_i| + |X_k|) |X_r - X_p| + (|X_p| + |X_r|) |X_k - X_i|},$$
where the expression on the right-hand side determines the minimum μ .
Finally, we note that the fractional term under maximization is invariant with respect to interchanging the indices i and k, as well as p and r. Therefore, we can replace the conditions $1 \le i, k \le M$ and $1 \le p, r \le M$ by the conditions $1 \le i < k \le M$ and $1 \le p < r \le M$, which yields the representation for $\mu$ in the form of (8). □
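To make the two-parameter case concrete, the following MATLAB sketch evaluates formulas (8)–(10) by direct enumeration of index pairs. It is an illustrative rendering under our own assumptions (numeric rather than symbolic data, at least one nonzero X(i), not all X(i) equal, and the function name cheb_fit_2p), not the author's published code; it returns the minimum error and one representative solution.

```matlab
function [mu, t1, t2] = cheb_fit_2p(X, Y)
% Two-parameter Chebyshev fit min max_i |theta1 + X(i)*theta2 - Y(i)|,
% evaluated from formulas (8)-(10). X, Y are numeric column vectors.
    M = numel(X);
    mu = 0;
    for i = 1:M-1
        for k = i+1:M
            dik = abs(X(i)) + abs(X(k));
            if dik == 0, continue; end
            for p = 1:M-1
                for r = p+1:M
                    if X(r) == X(p), continue; end
                    num = abs((Y(i)*X(k) - Y(k)*X(i))*(X(r) - X(p)) ...
                            - (Y(p)*X(r) - Y(r)*X(p))*(X(k) - X(i)));
                    den = dik*abs(X(r) - X(p)) ...
                        + (abs(X(p)) + abs(X(r)))*abs(X(k) - X(i));
                    mu = max(mu, num/den);
                end
            end
        end
    end
    % Box constraint (9) for theta1; pick its lower end as a representative.
    lo1 = -Inf; hi1 = Inf;
    for i = 1:M-1
        for k = i+1:M
            if X(k) == X(i), continue; end
            c = (Y(i)*X(k) - Y(k)*X(i))/(X(k) - X(i));
            w = (abs(X(i)) + abs(X(k)))/abs(X(k) - X(i));
            lo1 = max(lo1, -w*mu + c);
            hi1 = min(hi1,  w*mu + c);
        end
    end
    t1 = lo1;
    % Box constraint (10) for theta2, with theta1 fixed at t1.
    idx = X ~= 0;
    lo2 = max(-mu./abs(X(idx)) - t1./X(idx) + Y(idx)./X(idx));
    hi2 = min( mu./abs(X(idx)) - t1./X(idx) + Y(idx)./X(idx));
    t2 = (lo2 + hi2)/2;
end
```

The four nested loops over index pairs already hint at the growth in cost that is analyzed for the general procedure in Section 5.4.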

5. General Solution Procedure

We now use Lemma 1 to derive a complete solution of problem (1) by performing a series of reduction steps, each eliminating an unknown parameter in the problem and determining a box constraint for this parameter. We observe that the elimination of a parameter from the objective function as described in Lemma 1 leaves the general form of the problem unchanged. Therefore, we can repeat the elimination over and over again until the function has no more parameters, and thus becomes a constant that shows the minimum of the objective function in the initial problem.
At the same time, together with the elimination of a parameter from the objective function, Lemma 1 offers a box constraint for this parameter, represented in terms of those parameters which are retained in the function. We see that the last constraint does not depend on any other parameters, and thus is directly given by a double inequality with constant values on both sides. As a result, we can take the box constraints in the order from the last constraint to the first, which yields a system of double inequalities that completely determines the solution set of the problem.
We are in a position to describe the solution procedure formally in more detail. The procedure follows a direct solution method that examines the unknown parameters in reverse order, starting from the parameter $\theta_N$ and going backward to the parameter $\theta_1$. Let n be the number of parameters in the objective function in the current step of the procedure.
Initially, we take $n = N$ and set $M_n = M$. For all $i = 1, \ldots, M_n$ and $j = 1, \ldots, n$, we define
$$X_{ij}^{(n)} = X_{ij}, \qquad Y_i^{(n)} = Y_i.$$
We also introduce the matrix-vector notation
$$\boldsymbol{X}_n = \begin{pmatrix} X_{11}^{(n)} & \cdots & X_{1n}^{(n)} \\ \vdots & \ddots & \vdots \\ X_{M_n,1}^{(n)} & \cdots & X_{M_n,n}^{(n)} \end{pmatrix}, \qquad \boldsymbol{Y}_n = \begin{pmatrix} Y_1^{(n)} \\ \vdots \\ Y_{M_n}^{(n)} \end{pmatrix}, \qquad \boldsymbol{\theta}_n = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_n \end{pmatrix}.$$
For each $n = N, N-1, \ldots, 1$, the procedure produces a two-fold outcome: the reduction of the current problem by eliminating an unknown parameter, and the derivation of a box constraint for the eliminated parameter.

5.1. Elimination of Parameters

Assuming that the norm sign in what follows stands for the Chebyshev norm, we start with eliminating the parameter θ n from the problem
$$\min_{\theta_1, \ldots, \theta_n} \; \max_{1 \le i \le M_n} \left| \sum_{j=1}^{n} X_{ij}^{(n)} \theta_j - Y_i^{(n)} \right| = \min_{\boldsymbol{\theta}_n} \; \|\boldsymbol{X}_n \boldsymbol{\theta}_n - \boldsymbol{Y}_n\|.$$
It follows from Lemma 1 that, as a result of this elimination, the problem reduces, if n > 1 , to the problem
$$\min_{\theta_1, \ldots, \theta_{n-1}} \; \max_{1 \le i \le M_{n-1}} \left| \sum_{j=1}^{n-1} X_{ij}^{(n-1)} \theta_j - Y_i^{(n-1)} \right| = \min_{\boldsymbol{\theta}_{n-1}} \; \|\boldsymbol{X}_{n-1} \boldsymbol{\theta}_{n-1} - \boldsymbol{Y}_{n-1}\|,$$
or degenerates, if n = 1 , to the constant
$$\max_{1 \le i \le M_0} |Y_i^{(0)}| = \|\boldsymbol{Y}_0\|.$$
We now exploit the representation of problem (2) to establish formulas of recalculating the objective function when changing to the reduced problem.
First, we consider the condition for indices in (2), which takes the form $1 \le i < k \le M_n$. We assume that the pairs of indices $(i, k)$ defined by the condition are listed in the order of the sequence
$$(1, 2), \ldots, (1, M_n), \; (2, 3), \ldots, (2, M_n), \; \ldots, \; (M_n - 1, M_n).$$
It is not difficult to verify by direct substitution that each fixed pair ( i , k ) in this sequence has the number (index) calculated as
$$M_n (i - 1) - i(i - 1)/2 + k - i.$$
Furthermore, we use (2) to define, for all i and k such that $1 \le i < k \le M_n$ and for all $j = 1, \ldots, n-1$, the recurrent formulas
$$X_{M_n(i-1) - i(i-1)/2 + k - i,\; j}^{(n-1)} = \begin{cases} \dfrac{X_{ij}^{(n)} X_{kn}^{(n)} - X_{kj}^{(n)} X_{in}^{(n)}}{|X_{in}^{(n)}| + |X_{kn}^{(n)}|}, & \text{if } |X_{in}^{(n)}| + |X_{kn}^{(n)}| \ne 0; \\ 0, & \text{otherwise}; \end{cases}$$
$$Y_{M_n(i-1) - i(i-1)/2 + k - i}^{(n-1)} = \begin{cases} \dfrac{Y_i^{(n)} X_{kn}^{(n)} - Y_k^{(n)} X_{in}^{(n)}}{|X_{in}^{(n)}| + |X_{kn}^{(n)}|}, & \text{if } |X_{in}^{(n)}| + |X_{kn}^{(n)}| \ne 0; \\ 0, & \text{otherwise}. \end{cases} \qquad (12)$$
Note that if $|X_{in}^{(n)}| + |X_{kn}^{(n)}| = 0$, then the above formulas produce a row of zeros that corresponds to a zero term, which does not contribute to the objective function. We assume that all such zero rows are removed, and the rest of the rows are renumbered (reindexed) to preserve continual enumeration.
We denote the number of nonzero rows by $M_{n-1}$ and observe that
$$M_{n-1} \le M_n (M_n - 1)/2.$$
Finally, we take the numbers $X_{ij}^{(n-1)}$ and $Y_i^{(n-1)}$ with $i = 1, \ldots, M_{n-1}$ and $j = 1, \ldots, n-1$ to form the matrix and vector
$$\boldsymbol{X}_{n-1} = (X_{ij}^{(n-1)}), \qquad \boldsymbol{Y}_{n-1} = (Y_i^{(n-1)}),$$
which completely determine the objective function in the reduced problem. Specifically, the reduced problem for n = 1 degenerates into the constant
$$\mu = \|\boldsymbol{Y}_0\|, \qquad (13)$$
representing the minimum of the objective function in the initial problem.
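A compact way to read Formula (12) is as a routine that maps $(\boldsymbol{X}_n, \boldsymbol{Y}_n)$ to $(\boldsymbol{X}_{n-1}, \boldsymbol{Y}_{n-1})$. The MATLAB sketch below is a plain numeric rendering under our own assumptions (the function name eliminate_last and double rather than symbolic arithmetic); it enumerates the pairs $i < k$ in the order described above and drops zero rows as the text prescribes.

```matlab
function [Xr, Yr] = eliminate_last(Xn, Yn)
% One backward-elimination step: build the reduced data (X_{n-1}, Y_{n-1})
% from (X_n, Y_n) by Formula (12). Xn is M_n-by-n, Yn is M_n-by-1, n >= 1.
    [Mn, n] = size(Xn);
    Xr = []; Yr = [];
    for i = 1:Mn-1
        for k = i+1:Mn
            d = abs(Xn(i,n)) + abs(Xn(k,n));
            if d == 0, continue; end        % zero row: does not contribute
            row = (Xn(i,1:n-1)*Xn(k,n) - Xn(k,1:n-1)*Xn(i,n)) / d;
            rhs = (Yn(i)*Xn(k,n) - Yn(k)*Xn(i,n)) / d;
            Xr = [Xr; row]; %#ok<AGROW>
            Yr = [Yr; rhs]; %#ok<AGROW>
        end
    end
end
```

When n = 1 the returned matrix is empty and the vector Yr plays the role of $\boldsymbol{Y}_0$, whose largest absolute entry is the minimum value (13).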

5.2. Derivation of Box Constraints

We take the double inequality at (3) and adjust it to write the box constraint for the parameter θ n in the form
$$\max_{\substack{1 \le i \le M_n \\ X_{in}^{(n)} \ne 0}} \left( -\frac{\mu}{|X_{in}^{(n)}|} - \sum_{j=1}^{n-1} \frac{X_{ij}^{(n)}}{X_{in}^{(n)}} \theta_j + \frac{Y_i^{(n)}}{X_{in}^{(n)}} \right) \le \theta_n \le \min_{\substack{1 \le i \le M_n \\ X_{in}^{(n)} \ne 0}} \left( \frac{\mu}{|X_{in}^{(n)}|} - \sum_{j=1}^{n-1} \frac{X_{ij}^{(n)}}{X_{in}^{(n)}} \theta_j + \frac{Y_i^{(n)}}{X_{in}^{(n)}} \right),$$
where μ denotes the minimum value of the objective function in problem (1).
To represent this inequality constraint in vector form, we introduce the following notation. First, for all $i = 1, \ldots, M_n$ such that $X_{in}^{(n)} \ne 0$ and all $j = 1, \ldots, n-1$, we define
$$T_{ij}^{(n)} = \frac{X_{ij}^{(n)}}{X_{in}^{(n)}}, \qquad L_i^{(n)} = \frac{Y_i^{(n)}}{X_{in}^{(n)}} - \frac{\mu}{|X_{in}^{(n)}|}, \qquad U_i^{(n)} = \frac{Y_i^{(n)}}{X_{in}^{(n)}} + \frac{\mu}{|X_{in}^{(n)}|}. \qquad (14)$$
We note that all indices i with $X_{in}^{(n)} = 0$ are not taken into account when calculating the maximum and minimum in the double inequality, and hence are excluded from the index set $1, \ldots, M_n$. Assuming that the rest of indices are renumbered to preserve continual enumeration, we introduce the matrix and column vectors
$$\boldsymbol{T}_n = (T_{ij}^{(n)}), \qquad \boldsymbol{L}_n = (L_i^{(n)}), \qquad \boldsymbol{U}_n = (U_i^{(n)}).$$
With this matrix-vector notation, we write the box constraint, if n > 1 , as the double inequality
$$\max(\boldsymbol{L}_n - \boldsymbol{T}_n \boldsymbol{\theta}_{n-1}) \le \theta_n \le \min(\boldsymbol{U}_n - \boldsymbol{T}_n \boldsymbol{\theta}_{n-1}), \qquad (15)$$
and, if n = 1 , as the inequality
$$\max(\boldsymbol{L}_1) \le \theta_1 \le \min(\boldsymbol{U}_1), \qquad (16)$$
where max and min symbols are thought of as operators that calculate the maximum and minimum over all elements of corresponding vectors.
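The vector form of the box constraints is equally short in MATLAB. The sketch below computes the arrays of (14) for a given level n (the function name box_parts and the numeric arithmetic are our assumptions, not the author's code). With its output, constraint (15) reads max(Ln - Tn*th) <= theta_n <= min(Un - Tn*th), where th collects the already fixed values of theta_1, ..., theta_{n-1}; constraint (16) is recovered automatically when n = 1, because Tn is then empty and the product Tn*th vanishes.

```matlab
function [Tn, Ln, Un] = box_parts(Xn, Yn, mu)
% Ingredients of the box constraints (15)/(16) for theta_n, built from
% (X_n, Y_n) and the optimal value mu by Formula (14).
% Rows with a zero entry in the last column are dropped, as in the text.
    n   = size(Xn, 2);
    idx = Xn(:, n) ~= 0;                 % keep rows with X_{in}^{(n)} ~= 0
    Xs  = Xn(idx, :);  Ys = Yn(idx);
    Tn  = Xs(:, 1:n-1) ./ Xs(:, n);      % T_{ij}^{(n)} = X_{ij}^{(n)}/X_{in}^{(n)}
    Ln  = Ys ./ Xs(:, n) - mu ./ abs(Xs(:, n));
    Un  = Ys ./ Xs(:, n) + mu ./ abs(Xs(:, n));
end
```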

5.3. Solution Algorithm

We summarize the above consideration in the form of a computational algorithm to solve problem (1) in a finite number of steps. The algorithm includes two sequential phases: backward elimination and forward determination (substitution) of the unknown parameters.
The backward elimination starts with $n = N$ by setting $M_n = M$ and
$$\boldsymbol{X}_n = \boldsymbol{X}, \qquad \boldsymbol{Y}_n = \boldsymbol{Y}.$$
Furthermore, for each $n = N, \ldots, 1$, the matrix $\boldsymbol{X}_n$ and vector $\boldsymbol{Y}_n$ are used as described above to obtain the values of $\boldsymbol{X}_{n-1}$ and $\boldsymbol{Y}_{n-1}$ if $n > 1$ or the value of $\boldsymbol{Y}_0$ if $n = 1$. As supplementary results, the matrices $\boldsymbol{T}_n$ are also evaluated from the matrices $\boldsymbol{X}_n$.
The backward elimination completes at $n = 1$ by calculating the minimum value of the objective function, given by $\mu = \|\boldsymbol{Y}_0\|$.
The forward determination first uses the obtained minimum $\mu$, matrix $\boldsymbol{X}_1$ and vector $\boldsymbol{Y}_1$ to calculate the vectors $\boldsymbol{L}_1$ and $\boldsymbol{U}_1$, and then evaluates the box constraint for the unknown $\theta_1$ in the form of the double inequality at (16). Then, for each $n = 2, \ldots, N$, the vectors $\boldsymbol{L}_n$ and $\boldsymbol{U}_n$ are calculated from $\mu$, $\boldsymbol{X}_n$ and $\boldsymbol{Y}_n$ to represent the box constraints for $\theta_n$ as in (15).
Note that the bounds in the box constraint for the parameter $\theta_1$ are explicitly defined by constants, whereas the bounds for each parameter $\theta_n$ with $n > 1$ are defined as linear functions of the previous parameters $\theta_1, \ldots, \theta_{n-1}$. Therefore, we can first fix a value to satisfy the box constraint for $\theta_1$ and then substitute this value into the box constraint for $\theta_2$ to obtain explicit bounds given by constants. By repeating such calculations to fix a value for a parameter with explicit bounds and to evaluate bounds for the next parameter, we can successively determine a solution of the problem.
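Putting the two phases together gives a short driver. The sketch below reuses the hypothetical helpers eliminate_last and box_parts from the previous sketches and always picks the midpoint of each box, so it returns one representative solution rather than the full solution set; it is a numeric illustration of the method, not the author's symbolic implementation, and is practical only for small M and N for the reasons discussed in the next subsection.

```matlab
function [theta, mu] = cheb_fit(X, Y)
% Direct solution of problem (1) by backward elimination and forward
% substitution, following the two-phase procedure of Section 5.
    N = size(X, 2);
    Xs = cell(N, 1); Ys = cell(N, 1);
    Xs{N} = X; Ys{N} = Y;
    for n = N:-1:2                        % backward elimination
        [Xs{n-1}, Ys{n-1}] = eliminate_last(Xs{n}, Ys{n});
    end
    [~, Y0] = eliminate_last(Xs{1}, Ys{1});
    mu = max(abs(Y0));                    % minimum of the objective, mu = ||Y_0||
    theta = zeros(N, 1);                  % forward determination of parameters
    for n = 1:N
        [Tn, Ln, Un] = box_parts(Xs{n}, Ys{n}, mu);
        lo = max(Ln - Tn*theta(1:n-1));
        hi = min(Un - Tn*theta(1:n-1));
        theta(n) = (lo + hi)/2;           % any value in [lo, hi] is admissible
    end
end
```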

5.4. Computational Complexity

The most computationally intensive and memory demanding component of the algorithm, which determines the overall rate of computational complexity and memory requirement, is the calculation of entries in the matrices $\boldsymbol{X}_n$ and vectors $\boldsymbol{Y}_n$ for all $n = N-1, N-2, \ldots, 1$, and the vector $\boldsymbol{Y}_0$, by using (12). Though calculating one entry involves a few simple operations, the number of all entries grows very fast as M and N increase.
To derive a rough estimate for the number of entries in all matrices, we first evaluate the number of rows in each matrix. Assuming that the matrix $\boldsymbol{X}_N = \boldsymbol{X}$ has M rows, we see that the number of rows in $\boldsymbol{X}_{N-1}$ is bounded from above by
$$M_{N-1} \le M_N (M_N - 1)/2 = M(M - 1)/2 < M^2/2 = 2(M/2)^2.$$
Recursive application of this estimate yields an upper bound for the number of rows in the matrices $\boldsymbol{X}_{N-l}$ for each $l = 1, \ldots, N-1$ in the form
$$M_{N-l} \le M_{N-l+1} (M_{N-l+1} - 1)/2 < M_{N-l+1}^2/2 < 2(M/2)^{2^l}.$$
At the last step with $l = N$, we calculate the vector $\boldsymbol{Y}_0$, in which the number of entries is no more than
$$M_0 \le M_1 (M_1 - 1)/2 < M_1^2/2 < 2(M/2)^{2^N}.$$
Since we have $n + 1$ columns in the matrix $\boldsymbol{X}_n$ together with the vector $\boldsymbol{Y}_n$, the overall number of the entries in all steps can be estimated as
$$\sum_{l=1}^{N} (N - l + 1) M_{N-l} < \sum_{l=1}^{N} 2(N - l + 1)(M/2)^{2^l}.$$
We denote the upper bound on the right-hand side by C(N, M) and observe that this bound increases polynomially with respect to M, and double exponentially with N. For problems with few parameters, the value of C(N, M) seems to be rather acceptable. Specifically, in the three-parameter case with N = 3 and M = 10, 20, 50, we have C(3, 10) = 783,900, C(3, 20) = 200,040,600, and C(3, 50) = 305,177,347,500. A further increase of the number of parameters N results in a rapid rise in the value of C(N, M), as the following examples show: C(4, 10) = 305,177,347,700, C(5, 10) = 610,353,911,500.
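The bound is easy to tabulate; a one-line MATLAB transcription of the sum defining C(N, M) is given below, for cross-checking only.

```matlab
% Upper bound C(N, M) on the total number of entries, as defined above.
C = @(N, M) sum(2*(N - (1:N) + 1) .* (M/2).^(2.^(1:N)));
C(3, 10)   % returns 783900, the value quoted for the three-parameter case
```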
Note that the actual number of entries to calculate is fewer than that given by the bound C(N, M). As it follows from (12), this number can be further reduced if the matrices $\boldsymbol{X}_n$, or at least the initial matrix $\boldsymbol{X}_N = \boldsymbol{X}$, have many zero entries (sparse matrices). It is clear that, if a matrix has zero entries in a column other than the last one, the columns (and related parameters) can be renumbered to put the column with zero entries on the last place where zero entries can reduce computations. As an example of the case with good chances of having sparse matrices $\boldsymbol{X}_n$, one can consider problems where the initial matrix $\boldsymbol{X}$ has entries that take values from the set $\{-1, 0, 1\}$.
Finally, we observe that the calculations by Formula (12) can be performed for different entries quite independently, which offers strong potential for parallel implementation of the procedure on parallel and vector computers to provide more computational and memory resources, and hence to extend applicability to problems of higher dimensions.

6. Software Implementation and Numerical Examples

We conclude with comments on software implementation of the solution procedure and illustrative numerical examples of low dimensions that demonstrate the computational technique involved in the solution. For the sake of illustration, we concentrate on application to problems with exact input data given by integer (rational) numbers to find explicit rational solutions.
To obtain exact solutions, the procedure has been coded for serial symbolic computations in the MATLAB (Release R2020b) environment as a collection of functions that calculate all intermediate matrices and vectors, as well as provide the overall functionality of the algorithm. The numerical experiments were conducted on a custom computer with a 4-core 8-thread Intel Xeon E3-1231 v3 CPU at 3.40GHz and 32GB of DDR3 RAM, running Windows 10 Enterprise 64-bit OS.
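The exactness comes from keeping every quantity symbolic: if the input data are wrapped in sym, all the ratios produced by (12) and (14) remain rational numbers instead of floating-point approximations. A minimal sketch with made-up data (not the data of the examples below):

```matlab
A = sym([2 1; 1 3; 4 1]);   % exact integer data
b = sym([1; 2; 0]);
r = (b(1)*A(2,2) - b(2)*A(1,2)) / (abs(A(1,2)) + abs(A(2,2)));   % exact result 1/4
```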
Example 1.
Let us take N = 3 and M = 4 and consider the approximation (regression) problem with
X = 3 1 2 1 2 2 2 3 1 0 2 1 , Y = 2 1 1 0 .
We start with the backward elimination phase by setting X 3 = X and Y 3 = Y . We use (12) to calculate the entries in
X 2 = 2 1 / 2 1 / 3 5 / 3 1 1 5 / 3 4 / 3 1 / 3 2 / 3 1 1 / 2 , Y 2 = 1 / 2 0 2 / 3 1 / 3 1 / 3 1 / 2 ,
and then the entries in
X 1 = 21 / 13 1 21 / 11 9 / 7 3 / 2 3 / 4 7 / 9 1 / 7 9 / 13 9 / 7 3 / 5 1 1 / 3 3 / 11 3 / 7 , Y 1 = 5 / 13 1 / 9 5 / 11 1 / 7 1 / 2 5 / 12 5 / 27 5 / 21 5 / 13 11 / 21 1 / 15 5 / 9 1 / 3 3 / 11 3 / 7 .
Next, we obtain the vector Y 0 , which appears to have 105 entries and thus is not shown here to save space. Evaluating the maximum entry of Y 0 as the minimum of the objective function according to (13) yields
μ = 2 / 7 .
In parallel with evaluating the entries of X n and Y n , we apply (14) to find the entries in
T 3 = 3 / 2 1 / 2 1 / 2 1 2 3 0 2 , T 2 = 4 1 / 5 1 5 / 4 1 / 2 2 .
Substitution of the obtained minimum μ = 2 / 7 yields
L 3 = 6 / 7 5 / 14 5 / 7 2 / 7 , U 3 = 8 / 7 9 / 14 9 / 7 2 / 7 , L 2 = 3 / 7 6 / 35 8 / 21 13 / 28 1 / 14 11 / 7 , U 2 = 11 / 7 6 / 35 20 / 21 1 / 28 13 / 14 3 / 7 .
Finally, we calculate the vectors
L 1 = 3 / 49 11 / 63 13 / 147 1 / 9 1 / 7 11 / 63 19 / 147 11 / 3 1 / 7 5 / 27 23 / 63 17 / 63 1 / 7 1 / 21 1 / 3 , U 1 = 61 / 147 25 / 63 19 / 49 1 / 3 11 / 21 59 / 63 89 / 147 1 / 3 61 / 63 17 / 27 37 / 63 53 / 63 13 / 7 43 / 21 5 / 3 .
The forward determination (substitution) phase of the procedure involves application of (16) and then (15) to obtain the unique solution
θ 1 = 1 / 3 , θ 2 = 5 / 21 , θ 3 = 16 / 21 .
The actual number of entries in the matrices X n and vectors Y n to calculate in the problem is 153, which is fewer than the upper bound given by C ( 3 , 4 ) = 600 . The computer time to obtain the exact solution by symbolic computation averages 2.3 s.
Example 2.
Consider the problem with N = 3 and M = 10 , where the input data are given by
X = 3 1 2 1 2 2 2 3 1 0 2 1 1 2 1 3 1 0 1 1 1 1 1 2 0 3 1 2 1 0 , Y = 2 1 1 0 1 1 2 0 1 3 .
The application of the procedure yields the unique solution
θ 1 = 1 / 9 , θ 2 = 13 / 18 , θ 3 = 8 / 9 .
The matrices $\boldsymbol{X}_n$ and vectors $\boldsymbol{Y}_n$ involved in calculations have 444,280 entries (while the corresponding upper bound is C(3, 10) = 783,900). The symbolic computations take about 52 min of computer time.

7. Conclusions

A direct computational technique has been proposed to solve discrete linear Chebyshev approximation problems, which find wide application in various areas, including the least maximum absolute deviation regression in statistics. First, we have shown that the problem under consideration can be reduced, by eliminating an unknown parameter, to a problem with a smaller number of unknowns, together with a box constraint for the eliminated parameter. This result was used to obtain direct solutions to linear regression problems with one and two parameters.
To solve approximation problems of arbitrary dimension, we have developed a direct solution procedure that consists of two parts: backward elimination and forward substitution of the unknown parameters. The direct solution is of particular interest in problems where analytical solutions are desired, whereas the use of iterative algorithms may be inappropriate or inadequate. We have estimated the computational complexity of the procedure, discussed its MATLAB implementation intended to provide exact solutions by symbolic computations, and presented numerical examples.
Possible lines of further investigation can include modification and improvement of the algorithm to reduce computational complexity in solving problems in both exact (rational) and inexact (floating point) form. The development of parallel implementations of the algorithm to speed up calculations is also of interest.

Funding

This work was supported in part by the Russian Foundation for Basic Research grant number 20-010-00145.

Acknowledgments

The author is very grateful to three referees for their valuable comments and suggestions, which have been incorporated into the revised version of the manuscript.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. de Laplace, P.S. Mécanique Céleste; Hillard, Gray, Little, and Wilkins: Boston, MA, USA, 1832; Volume 2. (Engl. transl. with comment. by N. Bowditch).
  2. Harter, H.L. The method of least squares and some alternatives: Part I. Int. Stat. Rev. 1974, 42, 147–174.
  3. Steffens, K.G. The History of Approximation Theory; Birkhäuser: Boston, MA, USA, 2006.
  4. Tewarson, R.P. On minimax solutions of linear equations. Comput. J. 1972, 15, 277–279.
  5. Appa, G.; Smith, C. On L1 and Chebyshev estimation. Math. Program. 1973, 5, 73–87.
  6. Pınar, M.Ç. Overdetermined systems of linear equations. In Encyclopedia of Optimization, 2nd ed.; Floudas, C.A., Pardalos, P.M., Eds.; Springer: Boston, MA, USA, 2009; pp. 2893–2896.
  7. Rabinowitz, P. Applications of linear programming to numerical analysis. SIAM Rev. 1968, 10, 121–159.
  8. Hand, M.L. Aspects of Linear Regression Estimation Under the Criterion of Minimizing the Maximum Absolute Residual. Ph.D. Thesis, Iowa State University, Ames, IA, USA, 1978.
  9. Kennedy, W.J.; Gentle, J.E. Statistical Computing; Statistics: Textbooks and Monographs; Dekker: New York, NY, USA, 1980; Volume 33.
  10. James, F. Fitting tracks in wire chambers using the Chebyshev norm instead of least squares. Nucl. Instr. Meth. 1983, 211, 145–152.
  11. Bertsch, G.F.; Sabbey, B.; Uusnäkki, M. Fitting theories of nuclear binding energies. Phys. Rev. C 2005, 71, 054311.
  12. Mäkilä, P.M. Robust identification and Galois sequences. Internat. J. Control 1991, 54, 1189–1200.
  13. Akçay, H.; Hjalmarsson, H.; Ljung, L. On the choice of norms in system identification. IEEE Trans. Automat. Control 1996, 41, 1367–1372.
  14. Schölkopf, B.; Smola, A.J. Learning with Kernels; Adaptive Computation and Machine Learning; MIT Press: Cambridge, MA, USA, 2002; p. 626.
  15. Bartoszuk, M.; Beliakov, G.; Gagolewski, M.; James, S. Fitting aggregation functions to data: Part I—Linearization and regularization. In Information Processing and Management of Uncertainty in Knowledge-Based Systems; Carvalho, J.P., Lesot, M.J., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R.R., Eds.; Springer: Cham, Switzerland, 2016; Volume 611, pp. 767–779.
  16. Jaschke, S.R. Arbitrage bounds for the term structure of interest rates. Finance Stoch. 1997, 2, 29–40.
  17. Harter, H.L. The method of least squares and some alternatives: Part III. Int. Stat. Rev. 1975, 43, 1–44.
  18. Harter, H.L. The method of least squares and some alternatives: Part IV. Int. Stat. Rev. 1975, 43, 125–190.
  19. Späth, H. Mathematical Algorithms for Linear Regression; Computer Science and Scientific Computing; Academic Press, Inc.: San Diego, CA, USA, 1992; p. 338.
  20. Wagner, H.M. Linear programming techniques for regression analysis. J. Am. Statist. Assoc. 1959, 54, 206–212.
  21. Stiefel, E. Note on Jordan elimination, linear programming and Tchebycheff approximation. Numer. Math. 1960, 2, 1–17.
  22. Watson, G.A. On the best linear one-sided Chebyshev approximation. J. Approx. Theory 1973, 7, 48–58.
  23. Sposito, V.A. Minimizing the maximum absolute deviation. ACM SIGMAP Bull. 1976, 51–53.
  24. Armstrong, R.D.; Kung, D.S. Algorithm AS 135: Min-max estimates for a linear multiple regression problem. J. R. Stat. Soc. Ser. C. Appl. Stat. 1979, 28, 93–100.
  25. Kim, B.Y. Algorithm for the constrained Chebyshev estimation in linear regression. Commun. Stat. Appl. Methods 2000, 7, 47–54.
  26. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
  27. Castillo, E.; Mínguez, R.; Castillo, C.; Cofiño, A.S. Dealing with the multiplicity of solutions of the ℓ1 and ℓ∞ regression models. Eur. J. Oper. Res. 2008, 188, 460–484.
  28. Ene, A.; Vladu, A. Improved convergence for ℓ1 and ℓ∞ regression via iteratively reweighted least squares. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 10–15 June 2019; Chaudhuri, K., Salakhutdinov, R., Eds.; Volume 97, pp. 1794–1801.
  29. Krivulin, N. Complete solution of a constrained tropical optimization problem with application to location analysis. In Relational and Algebraic Methods in Computer Science; Lecture Notes in Computer Science; Höfner, P., Jipsen, P., Kahl, W., Müller, M.E., Eds.; Springer: Cham, Switzerland, 2014; Volume 8428, pp. 362–378.
  30. Krivulin, N. Algebraic solution of weighted minimax single-facility constrained location problems. In Relational and Algebraic Methods in Computer Science; Lecture Notes in Computer Science; Desharnais, J., Guttmann, W., Joosten, S., Eds.; Springer: Cham, Switzerland, 2018; Volume 11194, pp. 317–332.
  31. Krivulin, N. Algebraic solution of minimax single-facility constrained location problems with Chebyshev and rectilinear distances. J. Log. Algebr. Methods Program. 2020, 115, 100578.