Article

An Improvement of Least Squares Theory: Theory of Least p-Variances Approximation and p-Uncorrelated Functions

by Mohammad Masjed-Jamei 1,2

1 Faculty of Mathematics, K. N. Toosi University of Technology, Tehran P.O. Box 16315-1618, Iran
2 Alexander von Humboldt Foundation, 53173 Bonn, Germany
Mathematics 2025, 13(14), 2255; https://doi.org/10.3390/math13142255
Submission received: 11 May 2025 / Revised: 29 June 2025 / Accepted: 7 July 2025 / Published: 11 July 2025

Abstract

We establish a theory whose structure is based on a fixed variable and an algebraic inequality and which improves the well-known least squares theory. The mentioned fixed variable plays a basic role in creating such a theory. In this direction, some new concepts, such as p-covariances with respect to a fixed variable, p-correlation coefficients with respect to a fixed variable, and p-uncorrelatedness with respect to a fixed variable, are defined in order to establish least p-variance approximations. We then obtain a specific system, called the p-covariances linear system, and apply the p-uncorrelatedness condition on its elements to find a general representation for p-uncorrelated variables. Afterwards, we apply the concept of p-uncorrelatedness for continuous functions, particularly for polynomial sequences, and we find some new sequences, such as a generic two-parameter hypergeometric polynomial of the ${}_4F_3$ type that satisfies a p-uncorrelatedness property. In the sequel, we obtain an upper bound for 1-covariances, an improvement to the approximate solutions of over-determined systems and an improvement to the Bessel inequality and Parseval identity. Finally, we generalize the concept of least p-variance approximations based on several fixed orthogonal variables.

1. Introduction

Least squares theory is known in the literature as an essential tool for optimal approximations and regression analysis, and it has found extensive applications in mathematics, statistics, physics and engineering [1,2,3,4]. Some important parts of this theory include the following: the best linear unbiased estimator, the Gauss–Markov theorem, the moving least squares method and its improvements [5,6,7], least-squares spectral analysis, orthogonal projections and least squares function and polynomial approximations [8]. See also [9] for synergy interval partial least squares algorithms. It was officially discovered and published by A. M. Legendre [10] in 1805, though it is also co-credited to C. F. Gauss, who contributed significant theoretical advances to the method and may have previously used it in his work in 1795 [11]. The basis of this theory is to minimize the sum of the squares of the residuals, i.e., the differences between the observed values and the fitted values provided by an approximate linear (or nonlinear) model. In many linear models [12], in order to reduce the influence of errors in the derived observations, one would like to use a greater number of samplings than the number of unknown parameters in the model, which leads to so-called overdetermined linear systems. In other words, let $b \in \mathbb{R}^m$ be a given vector and $A \in \mathbb{R}^{m \times n}$ for $m > n$ a given matrix. The problem is to find a vector $x \in \mathbb{R}^n$ such that $Ax$ is the best approximation to $b$. There are many possible ways of defining the best solution; see [13]. A very common choice is to let $x$ be a solution of the minimization problem $\min_x \|Ax-b\|_2^2$, where $\|\cdot\|_2$ denotes the Euclidean vector norm. Now, if one refers to $r = b - Ax$ as the residual vector, a least squares solution in fact minimizes $\|r\|_2^2 = \sum_{i=1}^m r_i^2$. In linear statistical models, one assumes that the vector $b \in \mathbb{R}^m$ of observations is related to the unknown parameter vector $x \in \mathbb{R}^n$ through a linear relation
$$Ax = b + e, \qquad (1)$$
where $A \in \mathbb{R}^{m \times n}$ is a predetermined matrix of rank $n$, and $e$ denotes a vector of random errors. In the standard linear model, the following conditions are known as the basic conditions of the Gauss–Markov theorem:
$$E(e) = 0 \quad \text{and} \quad \operatorname{var}(e) = \sigma^2 I_m, \qquad (2)$$
i.e., the random errors $e_i$ are uncorrelated and all have zero means with the same variance, in which $\sigma^2$ is the true error variance. In summary, the Gauss–Markov theorem expresses that, if the linear model (1) is available, where $e$ is a random vector with mean and variance given by (2), then the optimal solution of (1) is the least squares estimator, obtained by minimizing the sum of squares $\|Ax-b\|_2^2$. Furthermore, $E(s^2) = \sigma^2$, where $s^2 = \frac{1}{m-n}\|b - A\hat{x}\|_2^2$ and $\hat{x}$ denotes the least squares estimator. Similar to other approximate techniques, the least squares method contains some constraints and limitations, just as in the above-mentioned theorem, where the conditions (2) are necessary. In this research, we introduce a new fixed variable in the minimization problem to somehow shrink the primitive quantity associated with sampling errors. For this purpose, we begin with a basic identity which gives rise to an important algebraic inequality, too. Let $S$ denote an inner product space, whose inner product satisfies the four properties of linearity, symmetry, homogeneity and positivity. There is an identity in this space that is directly related to the definition of the projection of two arbitrary elements of $S$, i.e.,
$$\operatorname{proj}_{x_2} x_1 = \frac{\langle x_1, x_2\rangle}{\langle x_2, x_2\rangle}\, x_2, \qquad (3)$$
where $\langle x_1, x_2\rangle$ indicates the inner product of $x_1$ and $x_2$ and $\langle x_2, x_2\rangle = \|x_2\|^2 \geq 0$ denotes the squared norm of $x_2$. In other words, suppose that $x, y, z$ are three specific elements of $S$, and $p \in [0,1]$ is a free parameter. Noting (3), the following identity holds true:
$$\left\langle x - \left(1-\sqrt{1-p}\right)\operatorname{proj}_z x,\; y - \left(1-\sqrt{1-p}\right)\operatorname{proj}_z y\right\rangle = \langle x, y\rangle - p\,\frac{\langle x, z\rangle\langle y, z\rangle}{\langle z, z\rangle}, \qquad (4)$$
and, naturally, for $y = x$, it leads to the Schwarz inequality
$$\langle x, x\rangle - p\,\frac{\langle x, z\rangle^2}{\langle z, z\rangle} \geq 0, \qquad p \in [0,1]. \qquad (5)$$
This identity can also be considered in mathematical statistics. Essentially, as the content of this research is about the least variance of an approximation based on a fixed variable, if we equivalently use the expected value symbol $E(\cdot)$ instead of the inner product symbol $\langle\cdot,\cdot\rangle$ (there is no practical difference between them here), we arrive at a new definition of the covariance concept, as follows:
Definition 1.
p-Covariances with respect to a fixed variable.
Let $X, Y$ and $Z$ be three random variables and $p \in [0,1]$. With reference to (4), we correspondingly define
$$\operatorname{cov}_p(X,Y;Z) = E\!\left[\left(X - \left(1-\sqrt{1-p}\right)\frac{E(XZ)}{E(Z^2)}Z\right)\left(Y - \left(1-\sqrt{1-p}\right)\frac{E(YZ)}{E(Z^2)}Z\right)\right] = E(XY) - p\,\frac{E(XZ)E(YZ)}{E(Z^2)}, \qquad (6)$$
and call it “p-covariance of X and Y with respect to the fixed variable Z”.
Note, in (6), that
$$\frac{E(XZ)}{E(Z^2)}\,Z = \operatorname{proj}_Z X,$$
and, therefore, e.g., for $p = 1$,
$$\operatorname{cov}_1(X,Y;Z) = E\!\left[\left(X - \operatorname{proj}_Z X\right)\left(Y - \operatorname{proj}_Z Y\right)\right].$$
If $p = 1$ and $Z$ follows a constant distribution, say $Z = c^*$, where $c^* \neq 0$, then (6) will reduce to an ordinary covariance as follows:
$$\operatorname{cov}_1(X,Y;c^*) = E(XY) - E(X)E(Y) = \operatorname{cov}(X,Y).$$
Also, for $p = 0$, it follows that the fixed variable $Z$ has no effect on Definition (6).
For $Y = X$ in (6), an extended definition of the ordinary variance is concluded as
$$\operatorname{var}_p(X;Z) = \operatorname{cov}_p(X,X;Z) = E(X^2) - p\,\frac{E^2(XZ)}{E(Z^2)} \geq 0, \qquad (7)$$
where $E(Z^2)$ is always assumed to be positive.
Moreover, for any fixed variable $Z$ and $p \in [0,1]$, we have
$$0 \leq \operatorname{var}_1(X;Z) \leq \operatorname{var}_p(X;Z) \leq \operatorname{var}_0(X;Z) = E(X^2), \qquad (8)$$
which is equivalent to
$$0 \leq \left\langle x - \frac{\langle x,z\rangle}{\langle z,z\rangle}z,\; x - \frac{\langle x,z\rangle}{\langle z,z\rangle}z\right\rangle \leq \left\langle x - \left(1-\sqrt{1-p}\right)\frac{\langle x,z\rangle}{\langle z,z\rangle}z,\; x - \left(1-\sqrt{1-p}\right)\frac{\langle x,z\rangle}{\langle z,z\rangle}z\right\rangle \leq \langle x,x\rangle \qquad \forall z \in S,$$
in an inner product space. Note that, after simplification, the second part of the latter inequality is of the same form as (5). Although inequality (8) shows that the best option is $p = 1$, we prefer to apply the parametric case $p \in [0,1]$ throughout this paper. The reasons for this decision will be revealed in the forthcoming sections (see the illustrative Section 2.4 in this regard). The following properties hold true for Definitions (6) and (7); a brief numerical sketch follows the list:
(a1) $\operatorname{cov}_p(X,Y;Z) = \operatorname{cov}_p(Y,X;Z)$.
(a2) $\operatorname{cov}_p(\alpha X, \beta Y;Z) = \alpha\beta\,\operatorname{cov}_p(X,Y;Z)$ $(\alpha,\beta \in \mathbb{R})$.
(a3) $\operatorname{cov}_p(X+\alpha, Y+\beta;Z) = \operatorname{cov}_p(X,Y;Z) + \alpha\,\operatorname{cov}_p(1,Y;Z) + \beta\,\operatorname{cov}_p(X,1;Z) + \alpha\beta\,\operatorname{cov}_p(1,1;Z)$.
(a4) $\operatorname{var}_p(X;Z) = 0$ if $p = 1$ and $Z = c^*X$ $(c^* \neq 0)$.
(a5) $\operatorname{cov}_p(X,Y;c^*X) = \operatorname{cov}_p(X,Y;c^*Y) = (1-p)E(XY)$ $(c^* \neq 0)$.
(a6) $\operatorname{cov}_p\left(\sum_{k=0}^n c_kX_k,\, X_m;Z\right) = \sum_{k=0}^n c_k\operatorname{cov}_p(X_k,X_m;Z)$ $(\{c_k\}_{k=0}^n \subset \mathbb{R})$, and
$$\operatorname{var}_p(\alpha X + \beta Y;Z) = \alpha^2\operatorname{var}_p(X;Z) + \beta^2\operatorname{var}_p(Y;Z) + 2\alpha\beta\,\operatorname{cov}_p(X,Y;Z). \qquad (9)$$
Definition 2.
p-Correlation coefficient with respect to a fixed variable.
Based on relations (6) and (7), we can define a generalization of the Pearson correlation coefficient as
$$\rho_p(X,Y;Z) = \frac{\operatorname{cov}_p(X,Y;Z)}{\sqrt{\operatorname{var}_p(X;Z)\operatorname{var}_p(Y;Z)}}, \qquad (10)$$
and call it “p-correlation coefficient of X and Y with respect to the fixed variable Z”.
It is straightforward to observe that $\rho_p(X,Y;Z) \in [-1,1]$, because, if the values
$$U = X - \left(1-\sqrt{1-p}\right)\frac{E(XZ)}{E(Z^2)}Z \quad \text{and} \quad V = Y - \left(1-\sqrt{1-p}\right)\frac{E(YZ)}{E(Z^2)}Z$$
are replaced in the Cauchy–Schwarz inequality [14]
$$E^2(UV) \leq E(U^2)E(V^2),$$
then $\operatorname{cov}_p^2(X,Y;Z) \leq \operatorname{var}_p(X;Z)\operatorname{var}_p(Y;Z)$. In this sense, note that
$$E(UZ) = \sqrt{1-p}\,E(XZ) \quad \text{and} \quad E(VZ) = \sqrt{1-p}\,E(YZ),$$
which are equal to zero only for the case $p = 1$.
Definition 3.
p-Normal standard variable with respect to a fixed variable.
Noting Definitions (10) and (6), since
$$\rho_p(X,Y;Z) = E\!\left[\frac{X - \left(1-\sqrt{1-p}\right)\operatorname{proj}_Z X}{\sqrt{\operatorname{var}_p(X;Z)}}\cdot\frac{Y - \left(1-\sqrt{1-p}\right)\operatorname{proj}_Z Y}{\sqrt{\operatorname{var}_p(Y;Z)}}\right],$$
a p-normal standard variable, say $N_p(X;Z)$, can be defined with respect to the fixed variable $Z$ as
$$N_p(X;Z) = \frac{X - \left(1-\sqrt{1-p}\right)\operatorname{proj}_Z X}{\sqrt{\operatorname{var}_p(X;Z)}}. \qquad (11)$$
For instance, we have $N_1(X;Z = c^*) = \dfrac{X - E(X)}{\sqrt{\operatorname{var}(X)}}$.
Definition 4.
p-Uncorrelatedness with respect to a fixed variable.
If, in (10), $\rho_p(X,Y;Z) = 0$, which is equivalent to the condition
$$p\,E(XZ)E(YZ) = E(XY)E(Z^2),$$
then we say that X and Y are p-uncorrelated with respect to the fixed variable Z.
Such a definition can be expressed in an inner product space, too. We say that two elements $x, y \in S$ are p-uncorrelated with respect to the fixed element $z \in S$ if
$$x - \left(1-\sqrt{1-p}\right)\operatorname{proj}_z x \;\perp\; y - \left(1-\sqrt{1-p}\right)\operatorname{proj}_z y,$$
or, equivalently,
$$p\,\langle x,z\rangle\langle y,z\rangle = \langle x,y\rangle\langle z,z\rangle. \qquad (12)$$
As (12) shows, $p = 0$ gives rise to the well-known orthogonality property. In summary, 0-uncorrelatedness gives the orthogonality notion, $p = 1$ results in complete uncorrelatedness and $p \in (0,1)$ leads to incomplete uncorrelatedness. We will use the phrase "completely uncorrelated" instead of 1-uncorrelated.
The aforesaid definition can similarly be expressed in a probability space. We say that two events $A$ and $B$ are p-independent with respect to the event $C$ if
$$p\,\Pr(A\mid C)\Pr(B\mid C) = \Pr(A\cap B),$$
which is equivalent to
$$p\,\Pr(A\cap C)\Pr(B\cap C) = \Pr(A\cap B)\Pr{}^2(C).$$
Hence, e.g., 1-independent means completely independent with respect to the event $C$.

2. Least p-Variances Approximation Based on a Fixed Variable

Let $\{X_k\}_{k=0}^n$ and $Y$ be arbitrary random variables, and consider the following approximation in the sequel:
$$Y \cong \sum_{k=0}^n c_kX_k, \qquad (13)$$
in which $\{c_k\}_{k=0}^n$ are unknown coefficients to be appropriately determined.
According to (7), the p-variance of the remaining term
$$R(c_0,c_1,\ldots,c_n) = \sum_{k=0}^n c_kX_k - Y \qquad (14)$$
is defined with respect to the fixed variable $Z$ as follows:
$$\operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right) = E\!\left[\left(R(c_0,\ldots,c_n) - \left(1-\sqrt{1-p}\right)\frac{E\!\left(ZR(c_0,\ldots,c_n)\right)}{E(Z^2)}Z\right)^2\right] = E\!\left[\left(\sum_{k=0}^n c_k\!\left(X_k - \left(1-\sqrt{1-p}\right)\frac{E(ZX_k)}{E(Z^2)}Z\right) - \left(Y - \left(1-\sqrt{1-p}\right)\frac{E(ZY)}{E(Z^2)}Z\right)\right)^2\right], \qquad (15)$$
where $\frac{E\left(ZR(c_0,\ldots,c_n)\right)}{E(Z^2)}Z$ shows the projection of the error term (14) on the fixed variable $Z$, and $\frac{E(ZX_k)}{E(Z^2)}Z$ the projection of each element on $Z$. For the special case $p = 1$, we have in fact considered (15) as
$$\operatorname{var}_1\!\left(R(c_0,\ldots,c_n);Z\right) = E\!\left[\left(R(c_0,\ldots,c_n) - \operatorname{proj}_Z R(c_0,\ldots,c_n)\right)^2\right].$$
We wish to find the unknown coefficients $\{c_k\}_{k=0}^n$ in (15), such that $\operatorname{var}_p\left(R(c_0,\ldots,c_n);Z\right)$ is minimized. In this direction, it is important to point out that, according to inequality (8), the following inequalities always hold for any arbitrary variable $Z$:
$$0 \leq \operatorname{var}_1\!\left(R(c_0,\ldots,c_n);Z\right) \leq \operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right) \leq \operatorname{var}_0\!\left(R(c_0,\ldots,c_n);Z\right) = E\!\left(R^2(c_0,\ldots,c_n)\right). \qquad (16)$$
This means that inequalities (16) are valid for any free selection of $\{c_k\}_{k=0}^n$, especially when they minimize the quantity (15). In other words, we have
$$\min_{c_0,\ldots,c_n}\operatorname{var}_1\!\left(R(c_0,\ldots,c_n);Z\right) \leq \min_{c_0,\ldots,c_n}\operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right) \leq \min_{c_0,\ldots,c_n}E\!\left(R^2(c_0,\ldots,c_n)\right), \qquad (17)$$
which shows the superiority of the present theory with respect to the ordinary least squares theory (see Section 2.4 and Section 3 in this regard).
To minimize (15), for every $j = 0,1,\ldots,n$ we have
$$\frac{\partial\operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right)}{\partial c_j} = 0 \;\Longleftrightarrow\; 2E\!\left[\left(X_j - \left(1-\sqrt{1-p}\right)\frac{E(X_jZ)}{E(Z^2)}Z\right)\left(\sum_{k=0}^n c_k\!\left(X_k - \left(1-\sqrt{1-p}\right)\frac{E(X_kZ)}{E(Z^2)}Z\right) - \left(Y - \left(1-\sqrt{1-p}\right)\frac{E(YZ)}{E(Z^2)}Z\right)\right)\right] = 0,$$
leading to the linear system
$$\begin{pmatrix} \operatorname{var}_p(X_0;Z) & \operatorname{cov}_p(X_1,X_0;Z) & \cdots & \operatorname{cov}_p(X_n,X_0;Z) \\ \operatorname{cov}_p(X_0,X_1;Z) & \operatorname{var}_p(X_1;Z) & \cdots & \operatorname{cov}_p(X_n,X_1;Z) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}_p(X_0,X_n;Z) & \operatorname{cov}_p(X_1,X_n;Z) & \cdots & \operatorname{var}_p(X_n;Z) \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} \operatorname{cov}_p(X_0,Y;Z) \\ \operatorname{cov}_p(X_1,Y;Z) \\ \vdots \\ \operatorname{cov}_p(X_n,Y;Z) \end{pmatrix}. \qquad (18)$$
Notice that solving the above system analytically is already sufficient to guarantee that $\operatorname{var}_p\left(R(c_0,\ldots,c_n);Z\right)$ is minimized, because we automatically have
$$\left\{\frac{\partial^2\operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right)}{\partial c_j^2}\right\}_{j=0}^n \geq 0.$$
For instance, after solving (18) for $n = 1$, the approximation (13) takes the form
$$Y \cong \frac{\operatorname{var}_p(X_1;Z)\operatorname{cov}_p(X_0,Y;Z) - \operatorname{cov}_p(X_0,X_1;Z)\operatorname{cov}_p(X_1,Y;Z)}{\operatorname{var}_p(X_1;Z)\operatorname{var}_p(X_0;Z) - \operatorname{cov}_p^2(X_0,X_1;Z)}\,X_0 + \frac{\operatorname{var}_p(X_0;Z)\operatorname{cov}_p(X_1,Y;Z) - \operatorname{cov}_p(X_0,X_1;Z)\operatorname{cov}_p(X_0,Y;Z)}{\operatorname{var}_p(X_1;Z)\operatorname{var}_p(X_0;Z) - \operatorname{cov}_p^2(X_0,X_1;Z)}\,X_1.$$
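As a computational illustration, the following Python sketch assembles and solves the system (18) in a discrete setting (our simplification: expectations are sample means over given sample vectors; the helper name `p_cov_system` is ours):

```python
import numpy as np

def p_cov_system(Xs, Y, Z, p):
    """Assemble and solve the p-covariances linear system (18) for the
    coefficients c_0,...,c_n of the approximation Y ~ sum c_k X_k."""
    E = np.mean
    cov = lambda U, V: E(U * V) - p * E(U * Z) * E(V * Z) / E(Z * Z)
    G = np.array([[cov(Xi, Xj) for Xj in Xs] for Xi in Xs])  # coefficient matrix
    b = np.array([cov(Xj, Y) for Xj in Xs])                  # right-hand side
    return np.linalg.solve(G, b)
```

For $n = 1$, the two returned coefficients agree with the closed form displayed above.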
In general, two cases, continuous and discrete, can be considered for the system (18), which we will from now on call a "p-covariances linear system with respect to the fixed variable $Z$".

2.1. Continuous Case of p-Covariances Linear System

If $X_k = \Phi_k(x)$ $(k = 0,\ldots,n)$, $Y = f(x)$ and $Z = z(x)$ are defined in a continuous space with a probability density function $\Pr(X = x) = \frac{w(x)}{\int_a^b w(x)\,dx}$, for any arbitrary function $w(x)$ positive on the interval $[a,b]$, then the components of the linear system (18) appear as
$$\operatorname{cov}_p\!\left(\Phi_i(x),\Phi_j(x);z(x)\right) = \frac{1}{\int_a^b w(x)\,dx}\left(\int_a^b w(x)\Phi_i(x)\Phi_j(x)\,dx - p\,\frac{\int_a^b w(x)\Phi_i(x)z(x)\,dx\,\int_a^b w(x)\Phi_j(x)z(x)\,dx}{\int_a^b w(x)z^2(x)\,dx}\right), \qquad (19)$$
where $\int_a^b w(x)z^2(x)\,dx > 0$.

2.2. Discrete Case of p-Covariances Linear System

If the above-mentioned variables are defined on a counter set, say $A^* = \{x_k\}_{k=0}^m$, with a discrete probability density function $\Pr(X = x) = \frac{j(x)}{\sum_{x\in A^*}j(x)}$, for any arbitrary function $j(x)$ positive on $A^*$, then
$$\operatorname{cov}_p\!\left(\Phi_i(x),\Phi_j(x);z(x)\right) = \frac{1}{\sum_{x\in A^*}j(x)}\left(\sum_{x\in A^*}j(x)\Phi_i(x)\Phi_j(x) - p\,\frac{\sum_{x\in A^*}j(x)\Phi_i(x)z(x)\,\sum_{x\in A^*}j(x)\Phi_j(x)z(x)}{\sum_{x\in A^*}j(x)z^2(x)}\right), \qquad (20)$$
where $\sum_{x\in A^*}j(x)z^2(x) > 0$.
One of the important cases of the approximation (13) is when $X_k = \Phi_k(x) = x^k$ $(k = 0,\ldots,n)$ together with a constant distribution function, which leads to the Hilbert problem [13,15] in a continuous space and to polynomial-type regression in a discrete space. In other words, if $X_k = x^k$ $(k = 0,\ldots,n)$, $Y = f(x)$, $Z = z(x)$ and $w(x) = 1$ for $x \in [0,1]$ are substituted into (18), then
$$\operatorname{cov}_p\!\left(x^i,x^j;z(x)\right) = \frac{1}{i+j+1} - p\,\frac{\int_0^1 x^iz(x)\,dx\,\int_0^1 x^jz(x)\,dx}{\int_0^1 z^2(x)\,dx} \quad \text{for } i,j = 0,1,\ldots,n,$$
and
$$\operatorname{cov}_p\!\left(x^j,f(x);z(x)\right) = \int_0^1 x^jf(x)\,dx - p\,\frac{\int_0^1 x^jz(x)\,dx\,\int_0^1 f(x)z(x)\,dx}{\int_0^1 z^2(x)\,dx} \quad \text{for } j = 0,1,\ldots,n.$$
For $p = 0$, we clearly encounter the Hilbert problem.
Similarly, if, in a discrete space, $X_k = x^k$ $(k = 0,\ldots,n)$, $A^* = \{x_k\}_{k=1}^m$, $Y = f(x)$ where $f(x_k) = y_k$, $Z = z(x)$ and $j(x) = 1$, the entries of the system (18), respectively, take the forms
$$\operatorname{cov}_p\!\left(x^i,x^j;z(x)\right) = \frac{1}{m}\sum_{k=1}^m x_k^{\,i+j} - \frac{p}{m}\,\frac{\sum_{k=1}^m x_k^{\,i}z(x_k)\,\sum_{k=1}^m x_k^{\,j}z(x_k)}{\sum_{k=1}^m z^2(x_k)} \quad \text{for } i,j = 0,1,\ldots,n,$$
and
$$\operatorname{cov}_p\!\left(x^j,f(x);z(x)\right) = \frac{1}{m}\sum_{k=1}^m y_kx_k^{\,j} - \frac{p}{m}\,\frac{\sum_{k=1}^m x_k^{\,j}z(x_k)\,\sum_{k=1}^m y_kz(x_k)}{\sum_{k=1}^m z^2(x_k)} \quad \text{for } j = 0,1,\ldots,n.$$
As a particular sample, the system corresponding to the linear regression $Y = c_0X_0 + c_1X_1 = c_0 + c_1x$ with respect to the fixed variable $Z = z(x)$ appears as
$$\begin{pmatrix} m\sum_{k=1}^m z^2(x_k) - p\left(\sum_{k=1}^m z(x_k)\right)^2 & \sum_{k=1}^m x_k\sum_{k=1}^m z^2(x_k) - p\sum_{k=1}^m z(x_k)\sum_{k=1}^m x_kz(x_k) \\ \sum_{k=1}^m x_k\sum_{k=1}^m z^2(x_k) - p\sum_{k=1}^m z(x_k)\sum_{k=1}^m x_kz(x_k) & \sum_{k=1}^m x_k^2\sum_{k=1}^m z^2(x_k) - p\left(\sum_{k=1}^m x_kz(x_k)\right)^2 \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \end{pmatrix} = \begin{pmatrix} \sum_{k=1}^m y_k\sum_{k=1}^m z^2(x_k) - p\sum_{k=1}^m z(x_k)\sum_{k=1}^m y_kz(x_k) \\ \sum_{k=1}^m x_ky_k\sum_{k=1}^m z^2(x_k) - p\sum_{k=1}^m x_kz(x_k)\sum_{k=1}^m y_kz(x_k) \end{pmatrix}.$$
It is interesting to know that, if $z(x) = c^* \neq 0$ is substituted into the above system for $p \neq 1$, the output gives the well-known result
$$Y = \frac{\sum_{k=1}^m x_k^2\sum_{k=1}^m y_k - \sum_{k=1}^m x_k\sum_{k=1}^m x_ky_k}{m\sum_{k=1}^m x_k^2 - \left(\sum_{k=1}^m x_k\right)^2} + \frac{m\sum_{k=1}^m x_ky_k - \sum_{k=1}^m x_k\sum_{k=1}^m y_k}{m\sum_{k=1}^m x_k^2 - \left(\sum_{k=1}^m x_k\right)^2}\,x,$$
i.e., the parameter $p$ has been automatically eliminated. In this sense, if the remaining case $p = 1$ is substituted into the above-mentioned system for $z(x) = c^* \neq 0$, as
$$\begin{pmatrix} 0 & 0 \\ 0 & m\sum_{k=1}^m x_k^2 - \left(\sum_{k=1}^m x_k\right)^2 \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \end{pmatrix} = \begin{pmatrix} 0 \\ m\sum_{k=1}^m x_ky_k - \sum_{k=1}^m x_k\sum_{k=1}^m y_k \end{pmatrix},$$
the output gives the result $Y = c_0 + \dfrac{m\sum_{k=1}^m x_ky_k - \sum_{k=1}^m x_k\sum_{k=1}^m y_k}{m\sum_{k=1}^m x_k^2 - \left(\sum_{k=1}^m x_k\right)^2}\,x$, where $c_0$ is a free parameter.
In the next section, we show why $c_0$ becomes a free parameter when $z(x) = c^* \neq 0$ and $p = 1$.
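The following short sketch (synthetic data; sample means as expectations; our own helper `fit_line`) runs this p-variance regression for an arbitrary $z(x)$ and confirms the reduction just described: with $z$ constant and $p \neq 1$, the solution coincides with the ordinary least squares line.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 60)
y = 2.0 + 3.0 * x + rng.normal(scale=0.05, size=x.size)
p, E = 0.5, np.mean

def fit_line(z):
    """Solve the 2x2 p-covariances system for Y ~ c0 + c1*x with fixed z(x)."""
    cov = lambda U, V: E(U * V) - p * E(U * z) * E(V * z) / E(z * z)
    Xs = [np.ones_like(x), x]
    G = np.array([[cov(Xi, Xj) for Xj in Xs] for Xi in Xs])
    return np.linalg.solve(G, np.array([cov(Xi, y) for Xi in Xs]))

print(fit_line(np.exp(x)))            # coefficients near (2, 3)
c_const = fit_line(np.ones_like(x))   # z(x) = c*, p != 1
assert np.allclose(c_const, np.polyfit(x, y, 1)[::-1])  # ordinary least squares line
```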

2.3. Some Reducible Cases of the p-Covariances Linear System

2.3.1. First Case

Suppose there exists a specific distribution for the fixed variable $Z$, such that
$$E(ZX_k) = 0 \quad \text{for every } k = 0,1,\ldots,n. \qquad (21)$$
Then, the p-covariances system (18) reduces to the well-known normal equations system
$$\begin{pmatrix} E(X_0^2) & E(X_1X_0) & \cdots & E(X_nX_0) \\ E(X_0X_1) & E(X_1^2) & \cdots & E(X_nX_1) \\ \vdots & \vdots & \ddots & \vdots \\ E(X_0X_n) & E(X_1X_n) & \cdots & E(X_n^2) \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} E(X_0Y) \\ E(X_1Y) \\ \vdots \\ E(X_nY) \end{pmatrix}, \qquad (22)$$
and
$$\min_{c_0,\ldots,c_n}\operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right)\Big|_{\{E(ZX_k)\}_{k=0}^n = 0} = \min_{c_0,\ldots,c_n}E\!\left(R^2(c_0,\ldots,c_n)\right).$$
Also, it is obvious for $p = 0$ that
$$\min_{c_0,\ldots,c_n}\operatorname{var}_0\!\left(R(c_0,\ldots,c_n);Z\right) = \min_{c_0,\ldots,c_n}E\!\left(R^2(c_0,\ldots,c_n)\right).$$

2.3.2. Second Case

If $Z$ follows a constant distribution (say $Z = c^* \neq 0$) and $X_0 = 1$, the condition (21) is no longer valid for $k = 0$ because $c^*E(1) \neq 0$. Therefore, noting the fact that $\operatorname{cov}_p(X,Y;c^*) = E(XY) - pE(X)E(Y)$, the linear system (18) reduces to
$$\begin{pmatrix} 1-p & (1-p)E(X_1) & \cdots & (1-p)E(X_n) \\ (1-p)E(X_1) & E(X_1^2)-pE^2(X_1) & \cdots & E(X_nX_1)-pE(X_n)E(X_1) \\ \vdots & \vdots & \ddots & \vdots \\ (1-p)E(X_n) & E(X_1X_n)-pE(X_1)E(X_n) & \cdots & E(X_n^2)-pE^2(X_n) \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} (1-p)E(Y) \\ E(X_1Y)-pE(X_1)E(Y) \\ \vdots \\ E(X_nY)-pE(X_n)E(Y) \end{pmatrix}. \qquad (24)$$
Relation (24) shows that, for $p = 1$, $c_0$ is a free parameter. For $X_0 = 1$, the approximation (13) takes the simplified form $Y \cong c_0 + \sum_{k=1}^n c_kX_k$, and replacing it in the normal system (22) yields
$$\begin{pmatrix} 1 & E(X_1) & \cdots & E(X_n) \\ E(X_1) & E(X_1^2) & \cdots & E(X_nX_1) \\ \vdots & \vdots & \ddots & \vdots \\ E(X_n) & E(X_1X_n) & \cdots & E(X_n^2) \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} E(Y) \\ E(X_1Y) \\ \vdots \\ E(X_nY) \end{pmatrix}. \qquad (25)$$
Now, the important point is that, after some elementary operations, the system (25) will be transformed exactly into the linear system (24). This means that, for any arbitrary $p \in [0,1]$, we have
$$\min_{c_1,\ldots,c_n}\operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right)\Big|_{Z=c^*\ \text{and}\ X_0=1} = \min_{c_1,\ldots,c_n}E\!\left(R^2(c_0,\ldots,c_n)\right)\Big|_{X_0=1\ \text{and}\ c_0 = E(Y)-\sum_{k=1}^n c_kE(X_k)}. \qquad (26)$$
Such a result as in (26) can similarly be proved for every $Z = X_k$, where $k = 1,2,\ldots,n$, as the following example shows.

2.4. An Illustrative Example and the Role of the Fixed Variable in It

The previous Section 2.1, Section 2.2 and Section 2.3 affirm that the fixed variable $Z$ plays a basic role in the theory of least p-variances. Here, we present an illustrative example to show the importance of such a fixed function in constituting the initial approximation (13). Let $Y = \sqrt{1-x}$ be defined on $[0,1]$ together with the probability density function $w(x) = 1$ and the fixed function $z(x) = x^\lambda$, where $\lambda > -1/2$ because $\int_0^1 w(x)z^2(x)\,dx = \frac{1}{2\lambda+1}$. For the basis functions $X_k = x^k$ $(k = 0,1,2)$, the initial approximation (13) takes the form
$$\sqrt{1-x} \cong c_0(\lambda,p) + c_1(\lambda,p)\,x + c_2(\lambda,p)\,x^2 = A_2(x;\lambda,p), \qquad (27)$$
in which the unknown coefficients satisfy the following linear system according to (18):
$$\begin{pmatrix} 1-\frac{2\lambda+1}{(\lambda+1)^2}p & \frac12-\frac{2\lambda+1}{(\lambda+1)(\lambda+2)}p & \frac13-\frac{2\lambda+1}{(\lambda+1)(\lambda+3)}p \\ \frac12-\frac{2\lambda+1}{(\lambda+1)(\lambda+2)}p & \frac13-\frac{2\lambda+1}{(\lambda+2)^2}p & \frac14-\frac{2\lambda+1}{(\lambda+2)(\lambda+3)}p \\ \frac13-\frac{2\lambda+1}{(\lambda+1)(\lambda+3)}p & \frac14-\frac{2\lambda+1}{(\lambda+2)(\lambda+3)}p & \frac15-\frac{2\lambda+1}{(\lambda+3)^2}p \end{pmatrix}\begin{pmatrix} c_0(\lambda,p) \\ c_1(\lambda,p) \\ c_2(\lambda,p) \end{pmatrix} = \begin{pmatrix} \frac23-\frac{2\lambda+1}{\lambda+1}B\!\left(\lambda+1,\frac32\right)p \\ \frac{4}{15}-\frac{2\lambda+1}{\lambda+2}B\!\left(\lambda+1,\frac32\right)p \\ \frac{16}{105}-\frac{2\lambda+1}{\lambda+3}B\!\left(\lambda+1,\frac32\right)p \end{pmatrix}, \qquad (28)$$
where
$$B(a,b) = \int_0^1 x^{a-1}(1-x)^{b-1}\,dx = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)} \qquad (a,b > 0).$$
After solving (28), $A_2(x;\lambda,p)$ will be determined, and since the remaining term of (27) is defined as
$$R_2(x;\lambda,p) = A_2(x;\lambda,p) - \sqrt{1-x},$$
so
$$\operatorname{var}_p\!\left(R_2(x;\lambda,p);x^\lambda\right) = E\!\left(R_2^2(x;\lambda,p)\right) - p(2\lambda+1)\,E^2\!\left(x^\lambda R_2(x;\lambda,p)\right).$$
However, there are two different cases for the parameter $\lambda > -1/2$, which should be studied separately.

2.4.1. First Case

If $\lambda = 0, 1, 2$, then the fixed functions $z(x) = 1, x, x^2$ coincide, respectively, with the first, second, and third basis functions in (27), and the system (28) is simplified for $\lambda = 0$ as
$$\begin{pmatrix} 1-p & \frac12-\frac12 p & \frac13-\frac13 p \\ \frac12-\frac12 p & \frac13-\frac14 p & \frac14-\frac16 p \\ \frac13-\frac13 p & \frac14-\frac16 p & \frac15-\frac19 p \end{pmatrix}\begin{pmatrix} c_0(0,p) \\ c_1(0,p) \\ c_2(0,p) \end{pmatrix} = \begin{pmatrix} \frac23-\frac23 p \\ \frac{4}{15}-\frac13 p \\ \frac{16}{105}-\frac29 p \end{pmatrix},$$
and for $\lambda = 1$ as
$$\begin{pmatrix} 1-\frac34 p & \frac12-\frac12 p & \frac13-\frac38 p \\ \frac12-\frac12 p & \frac13-\frac13 p & \frac14-\frac14 p \\ \frac13-\frac38 p & \frac14-\frac14 p & \frac15-\frac{3}{16}p \end{pmatrix}\begin{pmatrix} c_0(1,p) \\ c_1(1,p) \\ c_2(1,p) \end{pmatrix} = \begin{pmatrix} \frac23-\frac25 p \\ \frac{4}{15}-\frac{4}{15}p \\ \frac{16}{105}-\frac15 p \end{pmatrix},$$
and finally for $\lambda = 2$ as
$$\begin{pmatrix} 1-\frac59 p & \frac12-\frac{5}{12}p & \frac13-\frac13 p \\ \frac12-\frac{5}{12}p & \frac13-\frac{5}{16}p & \frac14-\frac14 p \\ \frac13-\frac13 p & \frac14-\frac14 p & \frac15-\frac15 p \end{pmatrix}\begin{pmatrix} c_0(2,p) \\ c_1(2,p) \\ c_2(2,p) \end{pmatrix} = \begin{pmatrix} \frac23-\frac{16}{63}p \\ \frac{4}{15}-\frac{4}{21}p \\ \frac{16}{105}-\frac{16}{105}p \end{pmatrix}.$$
According to the result (26), the solutions of all the above systems must be the same and independent of $p$, so that, in the final form, we have
$$A_2(x;0,p) = A_2(x;1,p) = A_2(x;2,p) = A_2(x;\lambda,0) = -\frac47 x^2 - \frac{8}{35}x + \frac{34}{35}.$$
Note for $p = 1$ that, after solving (28), we get
$$A_2(x;0,1) = -\frac47 x^2 - \frac{8}{35}x + c_0(0,1),$$
in which $c_0(0,1) = E\!\left(\sqrt{1-x}\right) + \frac47 E(x^2) + \frac{8}{35}E(x) = \frac{34}{35}$, according to (26) again.
In this sense, $E\!\left(R_2(x;0,1)\right) = E\!\left(A_2(x;0,1) - \sqrt{1-x}\right) = 0$ implies that
$$\operatorname{var}_1\!\left(R_2(x;0,1);z(x)=1\right) = E\!\left(R_2^2(x;0,1)\right) = \int_0^1\left(-\frac47 x^2 - \frac{8}{35}x + \frac{34}{35} - \sqrt{1-x}\right)^2 dx = \frac{1}{2450} \approx 0.000408.$$
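A quick numerical confirmation of this first case (a check of our own, using a fine grid and the trapezoidal rule for the integral):

```python
import numpy as np

# Verify var_1(R_2(x;0,1); 1) = E(R_2^2) = 1/2450 for the least squares
# quadratic A_2(x) = -(4/7)x^2 - (8/35)x + 34/35 approximating sqrt(1-x).
x = np.linspace(0.0, 1.0, 200_001)
A2 = -4/7 * x**2 - 8/35 * x + 34/35
R2 = A2 - np.sqrt(1.0 - x)
print(np.trapz(R2**2, x), 1/2450)   # both approximately 0.000408
```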

2.4.2. Second Case

Any assumption other than $\lambda = 0, 1, 2$ causes the system (28) to have a unique solution. For example, $\lambda = 1/2$ simplifies (28) in the form
$$\begin{pmatrix} 1-\frac89 p & \frac12-\frac{8}{15}p & \frac13-\frac{8}{21}p \\ \frac12-\frac{8}{15}p & \frac13-\frac{8}{25}p & \frac14-\frac{8}{35}p \\ \frac13-\frac{8}{21}p & \frac14-\frac{8}{35}p & \frac15-\frac{8}{49}p \end{pmatrix}\begin{pmatrix} c_0(\frac12,p) \\ c_1(\frac12,p) \\ c_2(\frac12,p) \end{pmatrix} = \begin{pmatrix} \frac23-\frac16\pi p \\ \frac{4}{15}-\frac{1}{10}\pi p \\ \frac{16}{105}-\frac{1}{14}\pi p \end{pmatrix}. \qquad (29)$$
After solving the system (29), we obtain
$$A_2\!\left(x;\tfrac12,p\right) = \frac{7\left(300-(75\pi+64)p\right)}{3(1224p-1225)}\,x^2 + \frac{20\left((21\pi-80)p+14\right)}{1224p-1225}\,x + \frac{(105\pi+2048)p-2380}{2(1224p-1225)}.$$
For the special case $p = 1$, i.e.,
$$A_2\!\left(x;\tfrac12,1\right) = \left(175\pi-\frac{1652}{3}\right)x^2 + (1320-420\pi)\,x + 166 - \frac{105}{2}\pi,$$
we have
$$E\!\left(x^{\frac12}R_2\!\left(x;\tfrac12,1\right)\right) = \frac27\left(175\pi-\frac{1652}{3}\right) + \frac25\left(1320-420\pi\right) + \frac23\left(166-\frac{105}{2}\pi\right) - \frac{\pi}{8},$$
and consequently
$$\operatorname{var}_1\!\left(R_2\!\left(x;\tfrac12,1\right);x^{\frac12}\right) = E\!\left(R_2^2\!\left(x;\tfrac12,1\right)\right) - 2E^2\!\left(x^{\frac12}R_2\!\left(x;\tfrac12,1\right)\right) \approx 0.000388.$$
As we observe,
$$\operatorname{var}_1\!\left(R_2\!\left(x;\tfrac12,1\right);x^{\frac12}\right) < \operatorname{var}_1\!\left(R_2(x;0,1);1\right) = E\!\left(R_2^2(x;0,1)\right),$$
which clearly shows the role of the fixed function in the obtained approximations.

2.5. How to Choose a Suitable Option for the Fixed Variable: A Geometric Interpretation

Although inequality (8) holds for any arbitrary variable $Z$, our wish is to determine some conditions for finding a suitable option for the fixed variable $Z$, such that
$$\operatorname{var}_1(X;Z) \ll E(X^2), \qquad (30)$$
and/or
$$\frac{E^2(XZ)}{E(Z^2)} \gg 0. \qquad (31)$$
Various states can be considered for the above-mentioned goal. For example, replacing $Z = c^* \neq 0$ in (31) yields $E^2(X) \gg 0$, which means that, if the magnitude of the mean value $E(X) \neq 0$ is very big, we expect the presented theory based on the fixed variable $Z = c^* \neq 0$ to act much better than the ordinary least squares theory. Another approach is to directly minimize the value $\operatorname{var}_1(X;Z)$ in (30), such that $X - \frac{E(XZ)}{E(Z^2)}Z \approx 0$. Figure 1 describes our aim in a vector space.
Figure 1. Least p-variances based on the fixed vector Z and the role of p from 0 to 1.
In this figure, $\|X\|_2^2 = \sum_{k=1}^m x_k^2$ leads to the same as ordinary least squares for $p = 0$; $\|X-A\|_2^2 = \|X - \operatorname{proj}_Z X\|_2^2$ leads to the complete type of least variances based on $Z$ and $p = 1$; and, finally, $\|X - pA\|_2^2$ leads to the least p-variances based on $Z$ and $p \in [0,1]$.

3. p-Uncorrelatedness Condition on the p-Covariances Linear System and Its Consequences

Since the coefficient matrix of the system (18) is symmetric and all its diagonal entries are positive, the aforesaid system is solvable, and its unknown coefficients are computable. However, as we observed in the previous example, analytically solving such linear systems is difficult from a computational point of view. To resolve this problem, we can impose the condition
$$\operatorname{cov}_p(X_i,X_j;Z) = \operatorname{var}_p(X_j;Z)\,\delta_{i,j} \quad \text{for every } i,j = 0,1,\ldots,n, \text{ with } \delta_{i,j} = \begin{cases} 0 & (i\neq j), \\ 1 & (i=j), \end{cases} \qquad (32)$$
on the elements of the system (18) to easily obtain the unknown coefficients in the form
$$c_k = \frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}.$$
In this case,
$$Y \cong \sum_{k=0}^n \frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}\,X_k \qquad (33)$$
would be the best approximation in the sense of least p-variance of the error with respect to the fixed variable Z.
Theorem 1.
Any finite set of random variables satisfying the p-uncorrelatedness condition (32) is linearly independent.
Proof. 
Assume that $\sum_{k=0}^n a_kX_k = 0$, where $\{a_k\}_{k=0}^n$ are not all zero, and $\{X_k\}_{k=0}^n$ are p-uncorrelated variables satisfying (32). Then, applying $\operatorname{cov}_p(\cdot,X_j;Z)$ to both sides of this assumption yields
$$0 = \operatorname{cov}_p\!\left(\sum_{k=0}^n a_kX_k,\,X_j;Z\right) = \sum_{k=0}^n a_k\operatorname{cov}_p(X_k,X_j;Z) = a_j\operatorname{var}_p(X_j;Z),$$
which implies that $a_j = 0$ for $j = 0,1,\ldots,n$, i.e., a contradiction. □
The question is now how to find linear combinations that are p-uncorrelated with respect to the variable Z. There is a basic theorem in this direction similar to the Gram–Schmidt orthogonalization theorem [16].
Theorem 2.
Let $\{V_k\}$ be a finite or infinite sequence of random variables, such that any finite number of elements $\{V_k\}_{k=0}^n$ are linearly independent. One can find constants $\{a_{i,j}\}$, such that the elements
$$X_0 = V_0, \quad X_1 = V_1 + a_{12}V_0, \quad X_2 = V_2 + a_{22}V_1 + a_{23}V_0, \quad \ldots, \quad X_n = V_n + a_{n2}V_{n-1} + \cdots + a_{n,n+1}V_0 \qquad (34)$$
are mutually p-uncorrelated with respect to the fixed variable $Z$.
Proof. 
Let us set, recursively,
$$X_0 = V_0, \quad X_1 = V_1 - \frac{\operatorname{cov}_p(X_0,V_1;Z)}{\operatorname{var}_p(X_0;Z)}X_0, \quad \ldots, \quad X_n = V_n - \sum_{k=0}^{n-1}\frac{\operatorname{cov}_p(X_k,V_n;Z)}{\operatorname{var}_p(X_k;Z)}X_k. \qquad (35)$$
Since relations (35) show that $X_n$ is a linear combination of $\{V_k\}_{k=0}^n$, it is just enough to show that $X_n$ is p-uncorrelated with $\{X_j\}_{j=0}^{n-1}$ with respect to the variable $Z$. For $j = 0,1,\ldots,n-1$, we have
$$\operatorname{cov}_p(X_n,X_j;Z) = \operatorname{cov}_p\!\left(V_n - \sum_{k=0}^{n-1}\frac{\operatorname{cov}_p(X_k,V_n;Z)}{\operatorname{var}_p(X_k;Z)}X_k,\,X_j;Z\right) = \operatorname{cov}_p(V_n,X_j;Z) - \sum_{k=0}^{n-1}\frac{\operatorname{cov}_p(X_k,V_n;Z)}{\operatorname{var}_p(X_k;Z)}\operatorname{cov}_p(X_k,X_j;Z) = \operatorname{cov}_p(V_n,X_j;Z) - \operatorname{cov}_p(X_j,V_n;Z) = 0. \;\square$$
In fact, Theorem 2 is a generalization of the Gram–Schmidt orthogonalization process when $p = 0$ or when the condition $E(ZV_k) = 0$ holds for every $k = 0,1,\ldots,n$.
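The recursion (35) is straightforward to implement. The sketch below (our illustration: sample vectors stand in for random variables, sample means for expectations, and the name `p_uncorrelate` is ours) generates p-uncorrelated variables from independent ones and checks the condition (32):

```python
import numpy as np

def p_uncorrelate(Vs, Z, p):
    """Generalized Gram-Schmidt process of Theorem 2, relation (35)."""
    E = np.mean
    cov = lambda U, W: E(U * W) - p * E(U * Z) * E(W * Z) / E(Z * Z)
    Xs = []
    for V in Vs:
        Xs.append(V - sum(cov(Xk, V) / cov(Xk, Xk) * Xk for Xk in Xs))
    return Xs

rng = np.random.default_rng(2)
Z = rng.normal(size=5000)
Vs = [rng.normal(size=5000) for _ in range(4)]
Xs = p_uncorrelate(Vs, Z, p=0.7)

E = np.mean
cov = lambda U, W: E(U * W) - 0.7 * E(U * Z) * E(W * Z) / E(Z * Z)
# Off-diagonal p-covariances vanish (up to floating point roundoff):
assert max(abs(cov(Xs[i], Xs[j])) for i in range(4) for j in range(i)) < 1e-10
```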
Theorem 3.
The reverse proposition of the previous theorem is as follows: there are constants $\{b_{i,j}\}$, such that
$$V_0 = X_0, \quad V_1 = X_1 + b_{12}X_0, \quad V_2 = X_2 + b_{22}X_1 + b_{23}X_0, \quad \ldots, \quad V_n = X_n + b_{n2}X_{n-1} + \cdots + b_{n,n+1}X_0, \qquad (36)$$
and
$$\operatorname{cov}_p(X_n,V_k;Z) = 0 \quad \text{for } k = 0,1,\ldots,n-1,$$
provided that
$$\operatorname{cov}_p(X_i,X_j;Z) = 0 \quad \text{for } i \neq j. \qquad (37)$$
Proof. 
By virtue of the fact that the system (34) is invertible, the general element of (36), in the form
$$V_k = X_k + b_{k2}X_{k-1} + \cdots + b_{k,k+1}X_0,$$
yields
$$\operatorname{cov}_p(X_n,V_k;Z) = \operatorname{cov}_p\!\left(X_n,\,X_k + \sum_{j=2}^{k+1}b_{k,j}X_{k+1-j};Z\right) = \operatorname{cov}_p(X_n,X_k;Z) + \sum_{j=2}^{k+1}b_{k,j}\operatorname{cov}_p(X_n,X_{k+1-j};Z) = 0,$$
which is valid for every k = 0 , 1 , , n 1 , according to the condition (37). □

A General Representation for p-Uncorrelated Variables

Both previous theorems, Theorems 2 and 3, now help us obtain a general representation for p-uncorrelated variables with respect to the variable $Z$. Assume that $\{V_k\}_{k=0}^n$ is an arbitrary, independent sequence of random variables, and $\{X_k\}_{k=0}^n$ satisfy the condition (32) as before. Noting the last row of (35), and the fact from (34) that $X_n = \sum_{k=0}^n a_{n,n+1-k}V_k$, where $a_{n,1} = 1$, for $m \leq n$ we have
$$\operatorname{cov}_p(X_n,X_m;Z) = \operatorname{cov}_p\!\left(X_n,\,V_m - \sum_{j=0}^{m-1}\frac{\operatorname{cov}_p(X_j,V_m;Z)}{\operatorname{var}_p(X_j;Z)}X_j;Z\right) = \operatorname{cov}_p(X_n,V_m;Z) - \sum_{j=0}^{m-1}\frac{\operatorname{cov}_p(X_j,V_m;Z)}{\operatorname{var}_p(X_j;Z)}\operatorname{cov}_p(X_n,X_j;Z) = \operatorname{cov}_p(V_m,X_n;Z) = \operatorname{cov}_p\!\left(V_m,\,\sum_{k=0}^n a_{n,n+1-k}V_k;Z\right) = \sum_{k=0}^n a_{n,n+1-k}\operatorname{cov}_p(V_m,V_k;Z) = \operatorname{var}_p(X_n;Z)\,\delta_{m,n}. \qquad (38)$$
For $m = 0,1,\ldots,n$, relation (38) leads eventually to the linear system
$$\begin{pmatrix} \operatorname{var}_p(V_0;Z) & \operatorname{cov}_p(V_1,V_0;Z) & \cdots & \operatorname{cov}_p(V_n,V_0;Z) \\ \operatorname{cov}_p(V_0,V_1;Z) & \operatorname{var}_p(V_1;Z) & \cdots & \operatorname{cov}_p(V_n,V_1;Z) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}_p(V_0,V_n;Z) & \operatorname{cov}_p(V_1,V_n;Z) & \cdots & \operatorname{var}_p(V_n;Z) \end{pmatrix}\begin{pmatrix} a_{n,n+1} \\ a_{n,n} \\ \vdots \\ a_{n,1} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ \operatorname{var}_p(X_n;Z) \end{pmatrix}. \qquad (39)$$
If the determinant
$$\Delta_n^{(p)}\!\left(\{V_k\}_{k=0}^n;Z\right) = \begin{vmatrix} \operatorname{var}_p(V_0;Z) & \operatorname{cov}_p(V_1,V_0;Z) & \cdots & \operatorname{cov}_p(V_n,V_0;Z) \\ \operatorname{cov}_p(V_0,V_1;Z) & \operatorname{var}_p(V_1;Z) & \cdots & \operatorname{cov}_p(V_n,V_1;Z) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}_p(V_0,V_n;Z) & \operatorname{cov}_p(V_1,V_n;Z) & \cdots & \operatorname{var}_p(V_n;Z) \end{vmatrix} \qquad (40)$$
is defined with $\Delta_{-1}^{(p)}(\cdot) = 1$, the first result of solving the system (39) is that the value $\operatorname{var}_p(X_n;Z)$ can be computed in terms of the determinant (40) as
$$\operatorname{var}_p(X_n;Z) = \frac{\Delta_n^{(p)}\!\left(\{V_k\}_{k=0}^n;Z\right)}{\Delta_{n-1}^{(p)}\!\left(\{V_k\}_{k=0}^{n-1};Z\right)}. \qquad (41)$$
On the other side, it follows from (41), as a first-order recurrence, that
$$\Delta_n^{(p)}\!\left(\{V_k\}_{k=0}^n;Z\right) = \prod_{k=0}^n \operatorname{var}_p(X_k;Z) \geq 0. \qquad (42)$$
The second result of solving (39) is to derive a general representation for $X_n$ as
$$\Delta_{n-1}^{(p)}\!\left(\{V_k\}_{k=0}^{n-1};Z\right)X_n = \begin{vmatrix} \operatorname{var}_p(V_0;Z) & \operatorname{cov}_p(V_1,V_0;Z) & \cdots & \operatorname{cov}_p(V_{n-1},V_0;Z) & V_0 \\ \operatorname{cov}_p(V_0,V_1;Z) & \operatorname{var}_p(V_1;Z) & \cdots & \operatorname{cov}_p(V_{n-1},V_1;Z) & V_1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \operatorname{cov}_p(V_0,V_n;Z) & \operatorname{cov}_p(V_1,V_n;Z) & \cdots & \operatorname{cov}_p(V_{n-1},V_n;Z) & V_n \end{vmatrix}, \qquad (43)$$
where we have exploited (41) to derive it.
Inequality (42) reveals that the determinant of the coefficient matrix of the system (18) is always non-negative. Moreover, if $p = 0$ or $E(ZX_k) = 0$ for every $k = 0,1,\ldots,n$ in (43), it gives the same as the Gram–Schmidt orthogonalization process. For instance, the expanded forms of (43) for $n = 0,1,2$ are
$$X_0 = V_0, \quad X_1 = V_1 - \frac{\operatorname{cov}_p(V_0,V_1;Z)}{\operatorname{var}_p(V_0;Z)}V_0, \quad X_2 = V_2 - \frac{\operatorname{var}_p(V_0;Z)\operatorname{cov}_p(V_1,V_2;Z) - \operatorname{cov}_p(V_0,V_1;Z)\operatorname{cov}_p(V_0,V_2;Z)}{\operatorname{var}_p(V_0;Z)\operatorname{var}_p(V_1;Z) - \operatorname{cov}_p^2(V_0,V_1;Z)}\,V_1 - \frac{\operatorname{var}_p(V_1;Z)\operatorname{cov}_p(V_0,V_2;Z) - \operatorname{cov}_p(V_0,V_1;Z)\operatorname{cov}_p(V_1,V_2;Z)}{\operatorname{var}_p(V_0;Z)\operatorname{var}_p(V_1;Z) - \operatorname{cov}_p^2(V_0,V_1;Z)}\,V_0,$$
satisfying the conditions
$$\operatorname{cov}_p(X_0,X_1;Z) = \operatorname{cov}_p(X_0,X_2;Z) = \operatorname{cov}_p(X_1,X_2;Z) = 0.$$
Notice that, although the general representation (43) is computationally slower than the recursive algorithm described in Theorem 2, it is of theoretical interest, as we have used it to find new sequences of p-uncorrelated functions (see the examples of Section 7, Section 8, Section 9 and Section 10).
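As a sanity check on (41) and (42), one can compare the determinant ratios with the p-variances produced by the recursive process of Theorem 2 (a small self-contained sketch of ours; sample means as expectations):

```python
import numpy as np

rng = np.random.default_rng(3)
Z = rng.normal(size=5000)
Vs = [rng.normal(size=5000) for _ in range(4)]
p, E = 0.4, np.mean
cov = lambda U, W: E(U * W) - p * E(U * Z) * E(W * Z) / E(Z * Z)

# p-uncorrelated X_k via the recursion (35)
Xs = []
for V in Vs:
    Xs.append(V - sum(cov(Xk, V) / cov(Xk, Xk) * Xk for Xk in Xs))

# Delta_n / Delta_{n-1} of (40)-(41) equals var_p(X_n; Z)
D = np.array([[cov(Vi, Vj) for Vj in Vs] for Vi in Vs])
for n in range(len(Vs)):
    ratio = np.linalg.det(D[:n+1, :n+1]) / (np.linalg.det(D[:n, :n]) if n else 1.0)
    assert np.isclose(ratio, cov(Xs[n], Xs[n]))
```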

4. p-Uncorrelated Expansions with Respect to a Fixed Variable

Let $\{X_k\}_{k=0}^\infty$ be a sequence of p-uncorrelated variables with respect to the variable $Z$. If $n \to \infty$ in approximation (33), the series
$$\sum_{k=0}^\infty \frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}\,X_k \qquad (44)$$
is called a p-uncorrelated expansion of $Y$ with respect to $Z$, and we then write
$$Y \sim \sum_{k=0}^\infty \frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}\,X_k.$$
In the case where n is finite, there is a separate theorem, as follows.
Theorem 4.
Let $\{V_k\}_{k=0}^n$ be linearly independent variables, and let $\{X_k\}_{k=0}^n$ be their corresponding p-uncorrelated elements generated via Theorem 2. If $\sum_{k=0}^n a_kV_k = W_n$, then
$$W_n = \sum_{k=0}^n \frac{\operatorname{cov}_p(X_k,W_n;Z)}{\operatorname{var}_p(X_k;Z)}\,X_k. \qquad (45)$$
Proof. 
First, from Theorem 3 and relation (36), we have
$$W_n = \sum_{k=0}^n a_kV_k = a_0X_0 + a_1(X_1 + b_{12}X_0) + \cdots + a_n(X_n + b_{n2}X_{n-1} + \cdots + b_{n,n+1}X_0) = c_0X_0 + c_1X_1 + \cdots + c_nX_n. \qquad (46)$$
Now, for $k = 0,1,\ldots,n$, apply $\operatorname{cov}_p(\cdot,X_k;Z)$ to both sides of (46) to get
$$\operatorname{cov}_p(W_n,X_k;Z) = \operatorname{cov}_p\!\left(\sum_{j=0}^n c_jX_j,\,X_k;Z\right) = \sum_{j=0}^n c_j\operatorname{cov}_p(X_j,X_k;Z) = c_k\operatorname{var}_p(X_k;Z),$$
which yields
$$c_k = \frac{\operatorname{cov}_p(W_n,X_k;Z)}{\operatorname{var}_p(X_k;Z)},$$
and (45) is, therefore, proven. □
Remark 1.
There is an important point in Theorem 4. Let us reconsider $n+1$ mutually p-uncorrelated elements $\{X_k\}_{k=0}^n$, satisfying the condition
$$\operatorname{cov}_p(X_i,X_j;Z) = \operatorname{var}_p(X_j;Z)\,\delta_{i,j}. \qquad (47)$$
Since
$$\operatorname{cov}_p(X_i,X_j;Z) = E\!\left[\left(X_i - \left(1-\sqrt{1-p}\right)\frac{E(X_iZ)}{E(Z^2)}Z\right)\left(X_j - \left(1-\sqrt{1-p}\right)\frac{E(X_jZ)}{E(Z^2)}Z\right)\right],$$
by defining the variable
$$T_{j,p}(Z) = X_j - \left(1-\sqrt{1-p}\right)\frac{E(X_jZ)}{E(Z^2)}Z = X_j - \left(1-\sqrt{1-p}\right)\operatorname{proj}_Z X_j, \qquad (48)$$
we observe in (47) that
$$E\!\left(T_{i,p}(Z)T_{j,p}(Z)\right) = E\!\left(T_{j,p}^2(Z)\right)\delta_{i,j}. \qquad (49)$$
Now notice that the orthogonal sequence $\{T_{j,p}(Z)\}_{j=0}^n$ in (48) is created only once we have already created the uncorrelated sequence $\{X_j\}_{j=0}^n$, because of the term $\frac{E(X_jZ)}{E(Z^2)}Z$; otherwise, generating such orthogonal sequences is impossible. This means that uncorrelatedness always leads to orthogonality, but the reverse is not possible when $\{E(ZX_k)\}_{k=0}^n \neq 0$. Also, note that the original shapes of these two sequences are quite different, so their corresponding approximations are different: the approximation corresponding to the orthogonal basis $\{T_{j,p}(Z)\}_{j=0}^n$ is optimized in the sense of ordinary least squares, while the approximation corresponding to the uncorrelated basis $\{X_j\}_{j=0}^n$ is optimized in the sense of least p-variances. Referring to (48) and (49), let us now consider the following orthogonal approximation sum for the same variable $W_n$ as stated in (45):
$$S_n = \sum_{j=0}^n \frac{E\!\left(W_nT_{j,p}(Z)\right)}{E\!\left(T_{j,p}^2(Z)\right)}\,T_{j,p}(Z). \qquad (50)$$
Since, in (50),
$$E\!\left(W_nT_{j,p}(Z)\right) = E\!\left[W_n\left(X_j - \left(1-\sqrt{1-p}\right)\frac{E(X_jZ)}{E(Z^2)}Z\right)\right] = \operatorname{cov}_p(X_j,W_n;Z)$$
and
$$E\!\left(T_{j,p}^2(Z)\right) = E\!\left[\left(X_j - \left(1-\sqrt{1-p}\right)\frac{E(X_jZ)}{E(Z^2)}Z\right)^2\right] = \operatorname{var}_p(X_j;Z),$$
the aforesaid sum is simplified as
$$S_n = \sum_{j=0}^n \frac{\operatorname{cov}_p(X_j,W_n;Z)}{\operatorname{var}_p(X_j;Z)}\left(X_j - \left(1-\sqrt{1-p}\right)\frac{E(X_jZ)}{E(Z^2)}Z\right) = W_n - \left(1-\sqrt{1-p}\right)\frac{Z}{E(Z^2)}\sum_{j=0}^n \frac{\operatorname{cov}_p(X_j,W_n;Z)\,E(X_jZ)}{\operatorname{var}_p(X_j;Z)}. \qquad (51)$$
The main point is now here: if, in (51), $p = 0$ or $E(X_jZ) = 0$ for $j = 0,\ldots,n$ (see Section 2.3.1), and/or the limit condition
$$\sum_{j=0}^\infty \frac{\operatorname{cov}_p(X_j,W_n;Z)\,E(X_jZ)}{\operatorname{var}_p(X_j;Z)} = 0 \qquad (52)$$
is satisfied, then $\lim_{n\to\infty}S_n = \lim_{n\to\infty}W_n = W^*$, and subsequently
$$\sum_{j=0}^\infty \frac{\operatorname{cov}_p(X_j,W^*;Z)}{\operatorname{var}_p(X_j;Z)}\,X_j = \sum_{j=0}^\infty \frac{E\!\left(W^*T_{j,p}(Z)\right)}{E\!\left(T_{j,p}^2(Z)\right)}\,T_{j,p}(Z). \qquad (53)$$
This means that the two aforesaid expansions in (53) coincide with each other only if the condition (52) holds. However, in many cases, this condition will not occur independently, as $W_n$ is distinct for each choice, especially in continuous spaces; see Corollary 9 in this regard. We finally add that, in a general case, we have
$$\sum_{j=0}^n C_jT_{j,p}(Z) = \sum_{j=0}^n C_j\left(X_j - \left(1-\sqrt{1-p}\right)\frac{E(X_jZ)}{E(Z^2)}Z\right) = \sum_{j=0}^n C_jX_j - \left(1-\sqrt{1-p}\right)\frac{Z}{E(Z^2)}\,E\!\left(Z\sum_{j=0}^n C_jX_j\right) = \sum_{j=0}^n C_jX_j - \left(1-\sqrt{1-p}\right)\operatorname{proj}_Z\!\left(\sum_{j=0}^n C_jX_j\right).$$
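To see relations (48) and (49) in action, the following sketch (ours; sample means as expectations, synthetic data) builds the $T_{j,p}(Z)$ from a p-uncorrelated family and verifies their plain orthogonality:

```python
import numpy as np

rng = np.random.default_rng(4)
Z = rng.normal(size=5000)
p = 0.64
c = 1.0 - np.sqrt(1.0 - p)               # c = 1 - sqrt(1-p), so 2c - c^2 = p
E = np.mean
cov = lambda U, W: E(U * W) - p * E(U * Z) * E(W * Z) / E(Z * Z)

Xs = []                                   # p-uncorrelated X_j via Theorem 2
for _ in range(4):
    V = rng.normal(size=5000)
    Xs.append(V - sum(cov(Xk, V) / cov(Xk, Xk) * Xk for Xk in Xs))

# T_{j,p}(Z) = X_j - (1 - sqrt(1-p)) proj_Z X_j, relation (48)
Ts = [X - c * E(X * Z) / E(Z * Z) * Z for X in Xs]
# Orthogonality (49): E(T_i T_j) = E(T_j^2) delta_{ij}
assert max(abs(E(Ts[i] * Ts[j])) for i in range(4) for j in range(i)) < 1e-10
```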

4.1. A Biorthogonality Property

Let $\{X_k\}_{k=0}^\infty$ be a sequence of completely uncorrelated variables satisfying the condition
$$\operatorname{cov}_1(X_k,X_j;Z) = \operatorname{var}_1(X_j;Z)\,\delta_{k,j}.$$
Since
$$E\!\left[X_k\left(X_j - \frac{E(X_jZ)}{E(Z^2)}Z\right)\right] = \operatorname{cov}_1(X_k,X_j;Z),$$
the relation
$$E\!\left[X_k\left(X_j - \frac{E(X_jZ)}{E(Z^2)}Z\right)\right] = \operatorname{var}_1(X_j;Z)\,\delta_{k,j}$$
implies that the sequence $\{X_k\}_{k=0}^\infty$ is biorthogonal with respect to the sequence $\left\{X_j - \frac{E(X_jZ)}{E(Z^2)}Z\right\}_{j=0}^\infty$. This means that every completely uncorrelated sequence can be biorthogonal, but the reverse proposition is not true. See also Corollary 7. To study the topic of biorthogonality, we refer the reader to [17,18].
Theorem 5.
Let $\{X_k\}_{k=0}^\infty$ be p-uncorrelated variables satisfying the condition (47), and let $Y$ be arbitrary. Then,
$$\operatorname{var}_p\!\left(Y - \sum_{k=0}^n \frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}X_k;\,Z\right) \leq \operatorname{var}_p\!\left(Y - \sum_{k=0}^n \alpha_kX_k;\,Z\right) \qquad (54)$$
for any selection of constants $\{\alpha_k\}_{k=0}^n$.
Proof. 
According to identity (9), we first have
$$\operatorname{var}_p\!\left(Y-\sum_{k=0}^n\alpha_kX_k;Z\right) = \operatorname{var}_p(Y;Z) + \operatorname{var}_p\!\left(\sum_{k=0}^n\alpha_kX_k;Z\right) - 2\operatorname{cov}_p\!\left(\sum_{k=0}^n\alpha_kX_k,\,Y;Z\right) = \operatorname{var}_p(Y;Z) + \sum_{k=0}^n\alpha_k^2\operatorname{var}_p(X_k;Z) - 2\sum_{k=0}^n\alpha_k\operatorname{cov}_p(X_k,Y;Z) = \operatorname{var}_p(Y;Z) + \sum_{k=0}^n\left(\alpha_k\sqrt{\operatorname{var}_p(X_k;Z)} - \frac{\operatorname{cov}_p(X_k,Y;Z)}{\sqrt{\operatorname{var}_p(X_k;Z)}}\right)^2 - \sum_{k=0}^n\frac{\operatorname{cov}_p^2(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}, \qquad (55)$$
in which the following identity has been used:
$$\operatorname{var}_p\!\left(\sum_{k=0}^n a_kX_k;Z\right) = \sum_{k=0}^n a_k^2\operatorname{var}_p(X_k;Z) \qquad \left(\operatorname{cov}_p(X_i,X_j;Z) = 0 \text{ for } i \neq j\right).$$
Since $\operatorname{var}_p(Y;Z) - \sum_{k=0}^n\frac{\operatorname{cov}_p^2(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}$ is independent of $\{\alpha_k\}_{k=0}^n$ in the last equality of (55), the minimum of $\operatorname{var}_p\!\left(Y-\sum_{k=0}^n\alpha_kX_k;Z\right)$ is therefore achieved only when
$$\alpha_k\sqrt{\operatorname{var}_p(X_k;Z)} - \frac{\operatorname{cov}_p(X_k,Y;Z)}{\sqrt{\operatorname{var}_p(X_k;Z)}} = 0, \qquad (56)$$
which results in the left side of inequality (54). □
With reference to (8), inequality (54) can be completed as follows:
$$\operatorname{var}_p\!\left(Y - \sum_{k=0}^n\frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}X_k;\,Z\right) \leq \operatorname{var}_p\!\left(Y - \sum_{k=0}^n\alpha_kX_k;\,Z\right) \leq E\!\left[\left(Y - \sum_{k=0}^n\alpha_kX_k\right)^2\right], \qquad (57)$$
where $\{\alpha_k\}_{k=0}^n$ are arbitrary.
Again, inequality (57) shows the superiority of p-uncorrelated approximations (33) with respect to any approximation that is made using the least squares method (in particular, orthogonal approximations). See Example 3 in this regard.
Corollary 1.
Let $\{V_k\}_{k=0}^n$ be an independent set of random variables. The problem of finding the linear combination of $V_0, V_1, \ldots, V_n$ that minimizes $\operatorname{var}_p\!\left(Y - \sum_{k=0}^n\beta_kV_k;Z\right)$, for any selection of constants $\{\beta_k\}_{k=0}^n$, is solved by $\sum_{k=0}^n\frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}X_k$, where $\{X_k\}_{k=0}^n$ are mutually p-uncorrelated with respect to the variable $Z$. This corollary tells us that every least p-variance problem with respect to a fixed variable is solved by an appropriate truncated p-uncorrelated expansion of type (44).
Corollary 2.
Substituting (56) into (55) gives
$$\operatorname{var}_p\!\left(Y - \sum_{k=0}^n\frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}X_k;\,Z\right) = \operatorname{var}_p(Y;Z) - \sum_{k=0}^n\frac{\operatorname{cov}_p^2(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)} = \operatorname{var}_p(Y;Z)\left(1 - \sum_{k=0}^n\rho_p^2(X_k,Y;Z)\right). \qquad (58)$$
On the other hand, the positivity of a p-variance in (58) implies that
$$\sum_{k=0}^n \rho_p^2(X_k,Y;Z) \leq 1,$$
where $\{X_k\}_{k=0}^n$ satisfy (47). Also, if $\{X_k\}_{k=0}^\infty$ is an infinite sequence of p-uncorrelated variables, then the latter inequality changes to
$$\sum_{k=0}^\infty \rho_p^2(X_k,Y;Z) \leq 1 \quad \text{or} \quad \sum_{k=0}^\infty \frac{\operatorname{cov}_p^2(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)} \leq \operatorname{var}_p(Y;Z). \qquad (59)$$
Finally, convergence of the series in (59) implies
$$\lim_{k\to\infty}\rho_p^2(X_k,Y;Z) = 0 \quad \text{or} \quad \lim_{k\to\infty}\frac{\operatorname{cov}_p^2(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)} = 0.$$
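A numerical illustration of the Bessel-type inequality (59) (our own sketch; sample means as expectations, synthetic data):

```python
import numpy as np

rng = np.random.default_rng(5)
Z = rng.normal(size=4000)
p, E = 0.6, np.mean
cov = lambda U, W: E(U * W) - p * E(U * Z) * E(W * Z) / E(Z * Z)

Xs = []                                  # p-uncorrelated basis via Theorem 2
for _ in range(6):
    V = rng.normal(size=4000)
    Xs.append(V - sum(cov(Xk, V) / cov(Xk, Xk) * Xk for Xk in Xs))

Y = rng.normal(size=4000)
bessel = sum(cov(Xk, Y)**2 / cov(Xk, Xk) for Xk in Xs)
assert bessel <= cov(Y, Y)               # inequality (59)
```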
Definition 5.
Convergence in p-variance with respect to a fixed variable.
Reconsider the approximation (13), and form the p-variance of its error just as in relation (15). If
$$\lim_{n\to\infty}\operatorname{var}_p\!\left(R(c_0,\ldots,c_n);Z\right) = 0, \qquad (60)$$
we call the limit relation (60) "convergence in p-variance with respect to the fixed variable $Z$". In this sense, by noting identity (58), if
$$0 = \lim_{n\to\infty}\operatorname{var}_p\!\left(Y - \sum_{k=0}^n\frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}X_k;\,Z\right) = \operatorname{var}_p(Y;Z)\,\lim_{n\to\infty}\left(1 - \sum_{k=0}^n\rho_p^2(X_k,Y;Z)\right),$$
then
$$\sum_{k=0}^\infty\rho_p^2(X_k,Y;Z) = 1 \quad \text{or} \quad \sum_{k=0}^\infty\frac{\operatorname{cov}_p^2(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)} = \operatorname{var}_p(Y;Z),$$
provided that $\{X_k\}_{k=0}^\infty$ are mutually p-uncorrelated.
Theorem 6.
Let $\{V_k\}_{k=0}^n$ be linearly independent variables, and let $\{X_k\}_{k=0}^n$ be their corresponding p-uncorrelated elements generated via Theorem 2. Then, for any selection of constants $\{\lambda_k\}_{k=0}^{n-1}$, we have
$$\operatorname{var}_p(X_n;Z) \leq \operatorname{var}_p\!\left(V_n + \sum_{k=0}^{n-1}\lambda_kV_k;\,Z\right). \qquad (61)$$
Proof. 
First, suppose in (54) that $n \to n-1$, and then take $Y = V_n$. Noting the last equality of (35) gives
$$\operatorname{var}_p\!\left(V_n - \sum_{k=0}^{n-1}\frac{\operatorname{cov}_p(X_k,V_n;Z)}{\operatorname{var}_p(X_k;Z)}X_k;\,Z\right) = \operatorname{var}_p(X_n;Z) \leq \operatorname{var}_p\!\left(V_n - \sum_{k=0}^{n-1}\alpha_kX_k;\,Z\right). \qquad (62)$$
On the other hand, according to relations (34), there exists an arbitrary sequence, say $\{\lambda_k\}_{k=0}^{n-1}$, such that $\sum_{k=0}^{n-1}\alpha_kX_k = -\sum_{k=0}^{n-1}\lambda_kV_k$ in (62), which completes the proof of (61). □
Corollary 3.
Let $\{X_k\}_{k=0}^n$ be p-uncorrelated variables with respect to the fixed variable $Z$. Then, for any variable $Y$ and every $j = 0,1,\ldots,n$, we have
$$\operatorname{cov}_p\!\left(Y - \sum_{k=0}^n\frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}X_k,\,X_j;Z\right) = \operatorname{cov}_p(Y,X_j;Z) - \sum_{k=0}^n\frac{\operatorname{cov}_p(X_k,Y;Z)}{\operatorname{var}_p(X_k;Z)}\operatorname{cov}_p(X_k,X_j;Z) = \operatorname{cov}_p(Y,X_j;Z) - \operatorname{cov}_p(X_j,Y;Z) = 0.$$
This is now a suitable position to study the p-uncorrelatedness topic practically and to introduce new sequences of functions in continuous and discrete spaces.

5. p-Uncorrelated Functions with Respect to a Fixed Function

Let $\{\Phi_k(x)\}_{k=0}^\infty$ be a sequence of continuous functions defined on $[a,b]$, and let $w(x)$ be positive on this interval. Also, let $z(x)$ be a continuous function such that $\int_a^b w(x)z^2(x)\,dx > 0$. Referring to (19), we say that $\{\Phi_k(x)\}_{k=0}^\infty$ are (weighted) p-uncorrelated functions with respect to the fixed function $z(x)$ if they satisfy the condition
$$\int_a^b w(x)\Phi_m(x)\Phi_n(x)\,dx - p\,\frac{\int_a^b w(x)\Phi_m(x)z(x)\,dx\,\int_a^b w(x)\Phi_n(x)z(x)\,dx}{\int_a^b w(x)z^2(x)\,dx} = \left(\int_a^b w(x)\Phi_n^2(x)\,dx - p\,\frac{\left(\int_a^b w(x)\Phi_n(x)z(x)\,dx\right)^2}{\int_a^b w(x)z^2(x)\,dx}\right)\delta_{m,n}. \qquad (63)$$
Especially when $w(x)z(x) = v(x)$, (63) becomes
$$\int_a^b w(x)\Phi_m(x)\Phi_n(x)\,dx - \lambda_p^2\int_a^b v(x)\Phi_m(x)\,dx\int_a^b v(x)\Phi_n(x)\,dx = \left(\int_a^b w(x)\Phi_n^2(x)\,dx - \lambda_p^2\left(\int_a^b v(x)\Phi_n(x)\,dx\right)^2\right)\delta_{m,n},$$
where $\lambda_p^2 = p\Big/\int_a^b \frac{v^2(x)}{w(x)}\,dx$ (e.g., see relation (225)).
Noting (20), such a definition in (63) can similarly be considered for discrete functions as
$$\sum_{x\in A^*}j(x)\Phi_m(x)\Phi_n(x) - p\,\frac{\sum_{x\in A^*}j(x)\Phi_m(x)z(x)\,\sum_{x\in A^*}j(x)\Phi_n(x)z(x)}{\sum_{x\in A^*}j(x)z^2(x)} = \left(\sum_{x\in A^*}j(x)\Phi_n^2(x) - p\,\frac{\left(\sum_{x\in A^*}j(x)\Phi_n(x)z(x)\right)^2}{\sum_{x\in A^*}j(x)z^2(x)}\right)\delta_{m,n}.$$
Remark 2.
Relation (63) clarifies that, after deriving p-uncorrelated functions, one will be able to define a sequence of orthogonal functions in the form
$$\Phi_n(x) - \left(1-\sqrt{1-p}\right)\frac{\int_a^b w(x)\Phi_n(x)z(x)\,dx}{\int_a^b w(x)z^2(x)\,dx}\,z(x) = G_n(x;p), \qquad (64)$$
having the orthogonality property
$$\int_a^b w(x)G_m(x;p)G_n(x;p)\,dx = \int_a^b w(x)G_n^2(x;p)\,dx\;\delta_{m,n} = \int_a^b w(x)\,dx\;\operatorname{var}_p\!\left(\Phi_n(x);z(x)\right)\delta_{m,n},$$
whereas its reverse proposition is not feasible when $\int_a^b w(x)\Phi_n(x)z(x)\,dx \neq 0$ or $p \neq 0$ in (64). Thus, p-uncorrelated functions cannot be directly derived from orthogonal sequences, as they are an extension of orthogonal functions for $p = 0$, while, from p-uncorrelated functions, the aforesaid aim is always possible.
There are essentially two classifications for deriving (weighted) p-uncorrelated functions with respect to a fixed function.

5.1. Two Classifications for Deriving p-Uncorrelated Functions

5.1.1. First Classification

If the weight function in (63) is non-classical, i.e., not belonging to the Pearson distributions family or its symmetric analogue [19,20], the procedure to derive p-uncorrelated functions will lead to the same generalized Gram–Schmidt process presented in Theorem 2 or, equivalently, to the representation (43). In this sense, there are ten classical sequences of real polynomials [20] that are orthogonal with respect to the Pearson distributions family
$$W\!\begin{pmatrix} d, e \\ a, b, c \end{pmatrix}\!(x) = \exp\left(\int \frac{dx+e}{ax^2+bx+c}\,dx\right) \quad (a,b,c,d,e \in \mathbb{R})$$
and its symmetric analogue [19]
$$W^*\!\begin{pmatrix} r, s \\ p, q \end{pmatrix}\!(x) = \exp\left(\int \frac{rx^2+s}{x(px^2+q)}\,dx\right) \quad (p,q,r,s \in \mathbb{R}).$$
Five of them are infinitely orthogonal with respect to particular cases of the two above-mentioned distributions, and the five other ones are finitely orthogonal, being limited to some parametric constraints. For other applications of orthogonal polynomials, see, e.g., [21,22,23,24]. Table 1 shows the main characteristics of these sequences, where
$$S_n\!\begin{pmatrix} r, s \\ p, q \end{pmatrix}\!(x) = \sum_{k=0}^{[n/2]}\binom{[n/2]}{k}\left(\prod_{i=0}^{[n/2]-(k+1)}\frac{\left(2i+(-1)^{n+1}+2[n/2]\right)p + r}{\left(2i+(-1)^{n+1}+2\right)q + s}\right)x^{n-2k} = S_n(x)$$
is a four-parametric sequence of symmetric orthogonal polynomials [19] that satisfies the second-order differential equation
$$x^2(px^2+q)\,S_n''(x) + x(rx^2+s)\,S_n'(x) - \left(n\left(r+(n-1)p\right)x^2 + \frac{1-(-1)^n}{2}\,s\right)S_n(x) = 0.$$

5.1.2. Second Classification

If the weight function belongs to the family of distributions presented in Table 1, there is an analogous theorem below for p-uncorrelated functions, similar to the well-known Sturm–Liouville theorem for orthogonal functions [20].
Theorem 7.
Let $a(x)$, $b(x)$, $u(x)$ and $v(x)$ be four given functions continuous on the interval $[a,b]$ and $p \in [0,1]$. If
$$w(x) = u(x)\exp\left(\int \frac{b(x)-a'(x)}{a(x)}\,dx\right)$$
is a positive function on $[a,b]$ and the sequence $\{\Phi_n(x)\}_{n=0}^\infty$ with the corresponding eigenvalues $\{\lambda_n\}$ satisfies an integro-differential equation of the form
$$a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \left(\lambda_nu(x) + v(x)\right)\Phi_n(x) = p\left(\lambda_n + A_z\right)u(x)z(x)\,\frac{\int_a^b w(x)\Phi_n(x)z(x)\,dx}{\int_a^b w(x)z^2(x)\,dx}, \qquad (65)$$
such that the function $z(x)$ satisfies the equation
$$\frac{a(x)z''(x) + b(x)z'(x) + v(x)z(x)}{u(x)z(x)} = A_z, \qquad (66)$$
where $A_z$ is a constant value, and also
$$\alpha_1\Phi_n(a) + \beta_1\Phi_n'(a) = 0, \qquad \alpha_2\Phi_n(b) + \beta_2\Phi_n'(b) = 0 \qquad (67)$$
are two boundary conditions where $\alpha_1, \beta_1, \alpha_2$ and $\beta_2$ are given constants, then $\{\Phi_n(x)\}_{n=0}^\infty$ are (weighted) p-uncorrelated functions with respect to the fixed function $z(x)$.
Proof. 
For the sake of simplicity, we assume in (65) that $\frac{\int_a^b w(x)\Phi_n(x)z(x)\,dx}{\int_a^b w(x)z^2(x)\,dx} = \beta_n^*$. Then, Equation (65) can be written in the self-adjoint form
$$\left(r(x)\Phi_n'(x)\right)' + \left(\lambda_nw(x) + q(x)\right)\Phi_n(x) = p\left(\lambda_n + A_z\right)\beta_n^*\,w(x)z(x), \qquad (68)$$
and for the index $m$ as
$$\left(r(x)\Phi_m'(x)\right)' + \left(\lambda_mw(x) + q(x)\right)\Phi_m(x) = p\left(\lambda_m + A_z\right)\beta_m^*\,w(x)z(x), \qquad (69)$$
in which
$$r(x) = a(x)\exp\left(\int \frac{b(x)-a'(x)}{a(x)}\,dx\right)$$
and
$$q(x) = v(x)\exp\left(\int \frac{b(x)-a'(x)}{a(x)}\,dx\right).$$
On multiplying (68) by $\Phi_m(x)$ and (69) by $\Phi_n(x)$ and subtracting, we get
$$\left[r(x)\left(\Phi_m(x)\Phi_n'(x) - \Phi_n(x)\Phi_m'(x)\right)\right]_a^b + \left(\lambda_n - \lambda_m\right)\int_a^b w(x)\Phi_n(x)\Phi_m(x)\,dx = p\left(\lambda_n + A_z\right)\beta_n^*\int_a^b w(x)\Phi_m(x)z(x)\,dx - p\left(\lambda_m + A_z\right)\beta_m^*\int_a^b w(x)\Phi_n(x)z(x)\,dx = p\left(\lambda_n - \lambda_m\right)\frac{\int_a^b w(x)\Phi_n(x)z(x)\,dx\,\int_a^b w(x)\Phi_m(x)z(x)\,dx}{\int_a^b w(x)z^2(x)\,dx},$$
which results in
$$\left(\lambda_n - \lambda_m\right)\left(\int_a^b w(x)\Phi_n(x)\Phi_m(x)\,dx - p\,\frac{\int_a^b w(x)\Phi_n(x)z(x)\,dx\,\int_a^b w(x)\Phi_m(x)z(x)\,dx}{\int_a^b w(x)z^2(x)\,dx}\right) = \left[r(x)\left(\Phi_m(x)\Phi_n'(x) - \Phi_n(x)\Phi_m'(x)\right)\right]_a^b = 0,$$
which, for $\lambda_n \neq \lambda_m$ (i.e., $n \neq m$), proves the p-uncorrelatedness condition (63). □
It is important to mention that the integro-differential equation (65) can be transformed into a fourth-order equation using (66). First of all, without loss of generality, we can assume in (65) that $u(x) = 1$, because it is enough to divide both sides of (65) by $u(x)$. In this case, substituting
$$z(x) = \frac{1}{p\left(\lambda_n + A_z\right)\beta_n^*}\left(a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \left(\lambda_n + v(x)\right)\Phi_n(x)\right)$$
into the equation
$$a(x)z''(x) + b(x)z'(x) + \left(v(x) - A_z\right)z(x) = 0$$
yields
$$a^2(x)\Phi_n^{(4)}(x) + 2a(x)\left(a'(x)+b(x)\right)\Phi_n^{(3)}(x) + \Big(a(x)\left(a''(x)+2b'(x)+2v(x)+\lambda_n-A_z\right) + b(x)\left(a'(x)+b(x)\right)\Big)\Phi_n''(x) + \Big(a(x)\left(b''(x)+2v'(x)\right) + b(x)\left(b'(x)+2v(x)+\lambda_n-A_z\right)\Big)\Phi_n'(x) + \Big(a(x)v''(x) + b(x)v'(x) + v^2(x) + \left(\lambda_n-A_z\right)v(x) - A_z\lambda_n\Big)\Phi_n(x) = 0, \qquad (70)$$
having four independent basis solutions.

5.2. A Generalized Theorem with an Important Remark

5.2.1. An Extension of the Previous Theorem

This time, let us assume on the right side of (65) that the function $u(x)z(x) = h(x)$ satisfies a second-order differential equation of the form
$$p(x)h''(x) + q(x)h'(x) + r(x)h(x) = 0,$$
instead of satisfying Equation (66), where $p(x)$, $q(x)$ and $r(x)$ are all known functions. Then, Equation (65) will be transformed into the fourth-order equation
$$p(x)\Big(a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \left(\lambda_nu(x)+v(x)\right)\Phi_n(x)\Big)'' + q(x)\Big(a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \left(\lambda_nu(x)+v(x)\right)\Phi_n(x)\Big)' + r(x)\Big(a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \left(\lambda_nu(x)+v(x)\right)\Phi_n(x)\Big) = 0,$$
which is equivalent to
$$p(x)a(x)\Phi_n^{(4)}(x) + \Big(p(x)\left(2a'(x)+b(x)\right) + q(x)a(x)\Big)\Phi_n^{(3)}(x) + \Big(p(x)u(x)\lambda_n + p(x)\left(a''(x)+2b'(x)+v(x)\right) + q(x)\left(a'(x)+b(x)\right) + r(x)a(x)\Big)\Phi_n''(x) + \Big(\left(2p(x)u'(x)+q(x)u(x)\right)\lambda_n + p(x)\left(b''(x)+2v'(x)\right) + q(x)\left(b'(x)+v(x)\right) + r(x)b(x)\Big)\Phi_n'(x) + \Big(\left(p(x)u''(x)+q(x)u'(x)+r(x)u(x)\right)\lambda_n + p(x)v''(x) + q(x)v'(x) + r(x)v(x)\Big)\Phi_n(x) = 0.$$
Noting the above-mentioned comments, Equation (65) can now be generalized as
$$a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \left(\lambda_nu(x)+v(x)\right)\Phi_n(x) = R_n(x), \qquad (71)$$
whose self-adjoint form
$$\left(r(x)\Phi_n'(x)\right)' + \left(\lambda_nw(x) + q(x)\right)\Phi_n(x) = \frac{w(x)}{u(x)}R_n(x)$$
yields
$$\left[r(x)\left(\Phi_m(x)\Phi_n'(x) - \Phi_n(x)\Phi_m'(x)\right)\right]_a^b + \left(\lambda_n - \lambda_m\right)\int_a^b w(x)\Phi_n(x)\Phi_m(x)\,dx = \int_a^b \frac{w(x)}{u(x)}\left(R_n(x)\Phi_m(x) - R_m(x)\Phi_n(x)\right)dx.$$
Now, if, in (71), we set
$$R_n(x) = p\left(\lambda_n + A_z\right)\frac{\int_a^b w(x)z(x)\Phi_n(x)\,dx}{\int_a^b w(x)z^2(x)\,dx}\,u(x)z(x),$$
we directly get to Theorem 7.

5.2.2. Remark

The previous section shows that Equation (65) is, in fact, a non-homogeneous second-order differential equation whose general solution for $p = 1$ reads as
$$\Phi_n(x) = c_1y_{1,n}(x) + c_2y_{2,n}(x) + \lambda^*z(x) \qquad (c_1, c_2, \lambda^* \in \mathbb{R}), \qquad (72)$$
where $c_1y_{1,n}(x) + c_2y_{2,n}(x)$ is the solution of the homogeneous equation
$$a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \left(\lambda_nu(x) + v(x)\right)\Phi_n(x) = 0,$$
with two independent solutions $y_{1,n}(x)$ and $y_{2,n}(x)$, respectively.
Hence, substituting (72) into (65) for $p = 1$ simplifies the general solution as
$$\Phi_n(x) = c_1\left(y_{1,n}(x) - \frac{\int_a^b w(x)z(x)y_{1,n}(x)\,dx}{\int_a^b w(x)z(x)y_{2,n}(x)\,dx}\,y_{2,n}(x)\right) + \lambda^*z(x), \qquad (73)$$
provided that $\int_a^b w(x)z(x)y_{2,n}(x)\,dx \neq 0$ $(n \in \mathbb{Z}^+)$. On the other hand, since, in (73),
$$\int_a^b \Phi_n(x)w(x)z(x)\,dx = \lambda^*\int_a^b w(x)z^2(x)\,dx,$$
the uncorrelatedness condition (63) for $p = 1$ is simplified as
$$\int_a^b w(x)\Phi_n(x)\Phi_m(x)\,dx - \lambda^{*2}\int_a^b w(x)z^2(x)\,dx = \left(\int_a^b w(x)\Phi_n^2(x)\,dx - \lambda^{*2}\int_a^b w(x)z^2(x)\,dx\right)\delta_{m,n},$$
leading to the "simple type of p-uncorrelated variables" to be defined and studied in detail in the next Section 6. In this sense, note that
$$\frac{1}{c_1^2}\left(\int_a^b w(x)\Phi_n^2(x)\,dx - \lambda^{*2}\int_a^b w(x)z^2(x)\,dx\right) = \int_a^b w(x)y_{1,n}^2(x)\,dx + \frac{\left(\int_a^b w(x)z(x)y_{1,n}(x)\,dx\right)^2}{\left(\int_a^b w(x)z(x)y_{2,n}(x)\,dx\right)^2}\int_a^b w(x)y_{2,n}^2(x)\,dx - 2\,\frac{\int_a^b w(x)z(x)y_{1,n}(x)\,dx}{\int_a^b w(x)z(x)y_{2,n}(x)\,dx}\int_a^b w(x)y_{1,n}(x)y_{2,n}(x)\,dx,$$
i.e., the aforementioned 1-variance value is always independent of $\lambda^*$.
Example 1.
Consider the following integro-differential equation:
$$\Phi_n''(x) + n^2\Phi_n(x) = \left(n^2 + A\right)z(x)\,\frac{\int_0^\pi \Phi_n(x)z(x)\,dx}{\int_0^\pi z^2(x)\,dx}, \qquad (74)$$
in which $z(x)$ satisfies the equation
$$\frac{z''(x)}{z(x)} = A. \qquad (75)$$
Noting (70), the equivalent form of (74), together with (75), is
$$\Phi_n^{(4)}(x) + \left(n^2 - A\right)\Phi_n''(x) - An^2\Phi_n(x) = \left(D^2 + n^2\right)\left(D^2 - A\right)\Phi_n(x) = 0,$$
having the general solution
$$\Phi_n(x) = c_1\sin nx + c_2\cos nx + \lambda^*z(x), \quad \text{where } \left(D^2 - A\right)z(x) = 0. \qquad (76)$$
According to (73), the solution (76) changes to
$$\Phi_n(x) = c_1\left(\sin nx - \frac{\int_0^\pi z(x)\sin nx\,dx}{\int_0^\pi z(x)\cos nx\,dx}\,\cos nx\right) + \lambda^*z(x), \qquad (77)$$
satisfying the relation
$$\int_0^\pi \Phi_n(x)\Phi_m(x)\,dx - \lambda^{*2}\int_0^\pi z^2(x)\,dx = c_1^2\,\frac{\pi}{2}\left(1 + \frac{\left(\int_0^\pi z(x)\sin nx\,dx\right)^2}{\left(\int_0^\pi z(x)\cos nx\,dx\right)^2}\right)\delta_{m,n}.$$
In this regard, notice that the two sub-sequences
$$\Phi_{2n}(x) = c_1\left(\sin 2nx - \frac{\int_0^\pi z(x)\sin 2nx\,dx}{\int_0^\pi z(x)\cos 2nx\,dx}\,\cos 2nx\right) + \lambda^*z(x)$$
and
$$\Phi_{2n+1}(x) = c_1\left(\sin(2n+1)x - \frac{\int_0^\pi z(x)\sin(2n+1)x\,dx}{\int_0^\pi z(x)\cos(2n+1)x\,dx}\,\cos(2n+1)x\right) + \lambda^*z(x)$$
in (77) are automatically uncorrelated (without any constraint) with respect to every fixed function $z(x)$ that satisfies (75).
In total, three cases can occur for the constant value A in (75).
(i) If $A = \alpha^2$ $(\alpha \neq 0)$, the solution of Equation (75) is $z(x) = a_1e^{\alpha x} + a_2e^{-\alpha x}$, and, subsequently, (76) takes the form
$$\Phi_n(x) = c_1\sin nx + c_2\cos nx + a_1e^{\alpha x} + a_2e^{-\alpha x}.$$
(ii) If $A = -\alpha^2$ $(\alpha \neq 0)$, then $z(x) = b_1\cos\alpha x + b_2\sin\alpha x$ and
$$\Phi_n(x) = c_1\sin nx + c_2\cos nx + b_1\cos\alpha x + b_2\sin\alpha x.$$
(iii) If $A = 0$, then $z(x) = c_3x + c_4$ and
$$\Phi_n(x) = c_1\sin nx + c_2\cos nx + c_3x + c_4.$$
For instance, if $z(x) = e^x$ is chosen from the above-mentioned options, the general solution of the equation
$$\Phi_n''(x) + n^2\Phi_n(x) = \frac{2\left(n^2+1\right)}{e^{2\pi}-1}\,e^x\int_0^\pi \Phi_n(x)e^x\,dx$$
is, according to (77),
$$\Phi_n(x) = c_1\left(\sin nx - \frac{\int_0^\pi e^x\sin nx\,dx}{\int_0^\pi e^x\cos nx\,dx}\,\cos nx\right) + \lambda^*e^x. \qquad (78)$$
On the other hand,
$$\int_0^\pi e^x\sin nx\,dx = \frac{n}{n^2+1}\left(1 - (-1)^ne^\pi\right) \quad \text{and} \quad \int_0^\pi e^x\cos nx\,dx = \frac{1}{n^2+1}\left((-1)^ne^\pi - 1\right)$$
imply that (78) takes the final form
$$\Phi_n(x) = c_1\left(\sin nx + n\cos nx\right) + \lambda^*e^x, \qquad (79)$$
where $c_1, \lambda^* \in \mathbb{R}$. Note that, since, in (79),
$$\Phi_n'(0) = \Phi_n(0) \quad \text{and} \quad \Phi_n'(\pi) = \Phi_n(\pi),$$
we have
$$\left[\Phi_m(x)\Phi_n'(x) - \Phi_n(x)\Phi_m'(x)\right]_0^\pi = 0.$$
However, for other choices of $z(x)$, we may have to determine the values $c_1$ and $\lambda^*$ uniquely. Also, for $n = m$, we obtain
$$\pi\,\operatorname{var}_1\!\left(\Phi_n(x);e^x\right) = \int_0^\pi \Phi_n^2(x)\,dx - \frac{e^{2\pi}-1}{2}\,\lambda^{*2} = c_1^2\,\frac{\pi}{2}\left(n^2+1\right) \qquad (n \in \mathbb{N}).$$
Finally, according to Remark 2, the sequence
$$\Phi_n(x) - \lambda^*e^x = c_1\left(\sin nx + n\cos nx\right)$$
must be orthogonal with respect to the constant weight function on $[0,\pi]$, i.e.,
$$\int_0^\pi\left(\sin(n+1)x + (n+1)\cos(n+1)x\right)\left(\sin(m+1)x + (m+1)\cos(m+1)x\right)dx = \frac{\pi}{2}\left(n^2+2n+2\right)\delta_{n,m},$$
in which
$$\sin(n+1)x + (n+1)\cos(n+1)x = \sqrt{n^2+2n+2}\,\sin\left((n+1)x + \arctan(n+1)\right),$$
having the explicit roots
$$x_k = \frac{k\pi - \arctan(n+1)}{n+1} \quad \text{for } k = 1,2,\ldots,n+1.$$
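These orthogonality relations are easy to confirm numerically; for instance, the following quadrature check of our own:

```python
import numpy as np

# Check: integral over [0, pi] of (sin nx + n cos nx)(sin mx + m cos mx) dx
# equals (pi/2)(n^2 + 1) delta_{nm}, in line with (79) and Remark 2.
x = np.linspace(0.0, np.pi, 100_001)
phi = lambda n: np.sin(n * x) + n * np.cos(n * x)
for n in range(1, 5):
    for m in range(1, 5):
        val = np.trapz(phi(n) * phi(m), x)
        target = np.pi / 2 * (n**2 + 1) if n == m else 0.0
        assert abs(val - target) < 1e-6
```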
The important Section 5.2.2 and its subsequent example suggest that we study and investigate a specific type of p-uncorrelated variables in the next section.

6. Simple p-Uncorrelated Variables and Their Relationship with Orthogonal Sequences

Instead of applying the previous statistical concepts, in this section we focus on an inner product space and study a special case that extends the orthogonality concept and leads to defining a simple type of p-uncorrelated variables. As before, assuming that $\langle u,v\rangle$ indicates the inner product of two free elements and $\langle u,u\rangle = \|u\|^2 \geq 0$ denotes the squared norm value, suppose that $\{x_k\}_{k=0}^n$ and $y$ are arbitrary variables, and consider the approximation
$$y \cong \sum_{k=0}^n c_kx_k, \qquad (80)$$
together with its remaining term
$$R(c_0,\ldots,c_n) = \sum_{k=0}^n c_kx_k - y, \qquad (81)$$
where $\{c_k\}_{k=0}^n$ are unknown coefficients to be appropriately determined.
It is well known that minimizing the value
$$\left\|R(c_0,\ldots,c_n)\right\|^2 = \left\langle \sum_{k=0}^n c_kx_k - y,\ \sum_{k=0}^n c_kx_k - y\right\rangle \qquad (82)$$
will lead to the normal equations system [13,17]
$$\begin{pmatrix} \langle x_0,x_0\rangle & \langle x_1,x_0\rangle & \cdots & \langle x_n,x_0\rangle \\ \langle x_0,x_1\rangle & \langle x_1,x_1\rangle & \cdots & \langle x_n,x_1\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle x_0,x_n\rangle & \langle x_1,x_n\rangle & \cdots & \langle x_n,x_n\rangle \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} \langle x_0,y\rangle \\ \langle x_1,y\rangle \\ \vdots \\ \langle x_n,y\rangle \end{pmatrix}. \qquad (83)$$
It is also known that applying the orthogonality condition
x m , x n = x n , x n δ m , n ,
on the components of the above linear system eventually gives the explicit solutions
c k = x k , y x k , x k k = 0 n .
Hence, the orthogonal approximation
$$y \approx \sum_{k=0}^n \frac{\langle x_k,y\rangle}{\langle x_k,x_k\rangle}\,x_k,$$
is derived as the optimal case in the sense of ordinary least squares.
By repeating the steps (80) to (83), instead of considering the orthogonality condition (84), we consider the generalized condition
$$\langle x_m, x_n\rangle = \begin{cases} \lambda & (m\neq n),\\[2pt] \langle x_n, x_n\rangle & (m = n),\end{cases}$$
where the non-negative parameter λ is independent of the indices m , n .
As we mentioned in the preceding sections, the random variables $\{X_k\}_{k=0}^n$ are said to be p-uncorrelated with respect to the fixed variable $Z$ if, for every $m\neq n$,
$$\mathrm{cov}_p(X_m, X_n; Z) = E(X_mX_n) - p\,\frac{E(X_mZ)\,E(X_nZ)}{E(Z^2)} = 0.$$
In addition to the condition (87), if we now add another condition as
$$E(X_kZ) = r^2 E(Z^2) > 0 \qquad\text{for every } k = 0,1,\ldots,n,$$
where $r$ is independent of $k$, then we say that $\{X_k\}_{k=0}^n$ are simple p-uncorrelated variables. In this case, both conditions (87) and (88) imply that
$$E(X_mX_n) = \begin{cases} p\,r^4 E(Z^2) & (m\neq n),\\[2pt] E(X_n^2) & (m=n).\end{cases}$$
Furthermore, replacing the extra condition (88) in the condition (87) for $p = r = 1$ leads to the orthogonality condition
$$E\big((X_m - Z)(X_n - Z)\big) = 0.$$
Note in (89) that the parameter p has no effect on simple p-uncorrelated variables. Hence, throughout this section, we assume that p = 1 . The definition (88) can similarly be applied in an inner product space. We say that x k is a simple uncorrelated variable if the condition (86) holds, and also,
$$\langle x_k, 1\rangle = r^2\langle 1,1\rangle > 0 \qquad\text{for every } k = 0,1,\ldots,n.$$
In this case, if the sequence $\{y_k = x_k - r\}_{k=0}^n$ is orthogonal then, for every $m\neq n$, the orthogonality condition
$$\langle x_m - r,\, x_n - r\rangle = \langle x_m,x_n\rangle - r\langle x_m,1\rangle - r\langle 1,x_n\rangle + r^2\langle 1,1\rangle = \langle x_m,x_n\rangle - r^2\langle 1,1\rangle = 0,$$
would be equivalent to (86) for $\lambda = r^2\langle 1,1\rangle$. Moreover, by the Schwarz inequality $\langle x_m,x_n\rangle^2 \leq \langle x_m,x_m\rangle\langle x_n,x_n\rangle$, the condition (86) implies that $\langle x_m,x_m\rangle\langle x_n,x_n\rangle \geq \lambda^2$ for $m\neq n$.
Noting the above-mentioned definition, now suppose that { x k } k = 0 n are simple uncorrelated variables, and then substitute (86) into the normal system (83) to arrive at
x 0 , x 0 λ λ λ x 1 , x 1 λ λ λ x n , x n c 0 c 1 c n = x 0 , y x 1 , y x n , y .
An interesting point is that the above system is explicitly solvable. In fact, there is even a more general case that can be solved analytically according to the following theorem.
Theorem 8.
Given the real numbers $\{a_{jj}\}_{j=0}^n$, $\{\lambda_j\}_{j=0}^n$ and $\{b_j\}_{j=0}^n$, the general solution of the linear system
$$\begin{pmatrix} a_{00} & \lambda_1 & \cdots & \lambda_n\\ \lambda_0 & a_{11} & \cdots & \lambda_n\\ \vdots & & \ddots & \vdots\\ \lambda_0 & \lambda_1 & \cdots & a_{nn}\end{pmatrix}\begin{pmatrix} c_0\\ c_1\\ \vdots\\ c_n\end{pmatrix} = \begin{pmatrix} b_0\\ b_1\\ \vdots\\ b_n\end{pmatrix},$$
in which $\lambda_j$ is constant down the $j$-th column, is (for every $j = 0,1,\ldots,n$) given by
$$c_j = \frac{b_j\Big(1+\sum_{k=0}^n \frac{\lambda_k}{a_{kk}-\lambda_k}\Big) - \sum_{k=0}^n \frac{\lambda_kb_k}{a_{kk}-\lambda_k}}{(a_{jj}-\lambda_j)\Big(1+\sum_{k=0}^n \frac{\lambda_k}{a_{kk}-\lambda_k}\Big)} = \frac{b_j - S_n\big(\lambda_k;\{a_{kk},b_k\}_{k=0}^n\big)}{a_{jj}-\lambda_j},$$
where
$$S_n\big(\lambda_k;\{a_{kk},b_k\}_{k=0}^n\big) = \frac{\sum_{k=0}^n \lambda_kb_k\prod_{j=0,\,j\neq k}^n (a_{jj}-\lambda_j)}{\prod_{j=0}^n (a_{jj}-\lambda_j) + \sum_{k=0}^n \lambda_k\prod_{j=0,\,j\neq k}^n (a_{jj}-\lambda_j)}.$$
Proof. 
First, the system (91) can be rewritten as follows:
$$\sum_{k=0}^n \lambda_kc_k = b_0 - (a_{00}-\lambda_0)c_0,\qquad \sum_{k=0}^n \lambda_kc_k = b_1 - (a_{11}-\lambda_1)c_1,\qquad \ldots,\qquad \sum_{k=0}^n \lambda_kc_k = b_n - (a_{nn}-\lambda_n)c_n.$$
Comparing the first two rows of (93) yields
$$c_1 = \frac{b_1-b_0}{a_{11}-\lambda_1} + \frac{a_{00}-\lambda_0}{a_{11}-\lambda_1}\,c_0.$$
Likewise, comparing the first row of (93) with the other rows gives, in general,
$$c_k = \frac{b_k-b_0}{a_{kk}-\lambda_k} + \frac{a_{00}-\lambda_0}{a_{kk}-\lambda_k}\,c_0.$$
By replacing (94) in the first row of (93), i.e.,
$$\sum_{k=0}^n \lambda_k\Big(\frac{b_k-b_0}{a_{kk}-\lambda_k} + \frac{a_{00}-\lambda_0}{a_{kk}-\lambda_k}\,c_0\Big) = b_0 - (a_{00}-\lambda_0)c_0,$$
we finally obtain
$$c_0 = \frac{b_0 - \sum_{k=0}^n \frac{\lambda_k(b_k-b_0)}{a_{kk}-\lambda_k}}{(a_{00}-\lambda_0)\Big(1+\sum_{k=0}^n \frac{\lambda_k}{a_{kk}-\lambda_k}\Big)} = \frac{b_0 - S_n\big(\lambda_k;\{a_{kk},b_k\}_{k=0}^n\big)}{a_{00}-\lambda_0}.$$
Following the same approach for the other unknown coefficients, $\{c_j\}_{j=1}^n$ are derived as
$$c_j = \frac{b_j - \sum_{k=0,\,k\neq j}^n \frac{\lambda_k(b_k-b_j)}{a_{kk}-\lambda_k}}{a_{jj} + \sum_{k=0,\,k\neq j}^n \frac{\lambda_k(a_{jj}-\lambda_j)}{a_{kk}-\lambda_k}} = \frac{b_j - \sum_{k=0}^n \frac{\lambda_k(b_k-b_j)}{a_{kk}-\lambda_k}}{(a_{jj}-\lambda_j)\Big(1+\sum_{k=0}^n \frac{\lambda_k}{a_{kk}-\lambda_k}\Big)} = \frac{b_j - S_n\big(\lambda_k;\{a_{kk},b_k\}_{k=0}^n\big)}{a_{jj}-\lambda_j}. \qquad\square$$
A particular case of the system (91) arises when $\lambda_0 = \lambda_1 = \cdots = \lambda_n = \lambda$. In this case, the general solution (92) simplifies to
$$c_j = \frac{b_j\Big(1+\lambda\sum_{k=0}^n \frac{1}{a_{kk}-\lambda}\Big) - \lambda\sum_{k=0}^n \frac{b_k}{a_{kk}-\lambda}}{(a_{jj}-\lambda)\Big(1+\lambda\sum_{k=0}^n \frac{1}{a_{kk}-\lambda}\Big)} = \frac{b_j - S_n\big(\lambda;\{a_{kk},b_k\}_{k=0}^n\big)}{a_{jj}-\lambda},$$
where
$$S_n\big(\lambda;\{a_{kk},b_k\}_{k=0}^n\big) = \frac{\lambda\sum_{k=0}^n b_k\prod_{j=0,\,j\neq k}^n (a_{jj}-\lambda)}{\prod_{j=0}^n (a_{jj}-\lambda) + \lambda\sum_{k=0}^n \prod_{j=0,\,j\neq k}^n (a_{jj}-\lambda)}.$$
From (95) and (96), one can conclude that
$$\det\begin{pmatrix} a_{00} & \lambda & \cdots & \lambda\\ \lambda & a_{11} & \cdots & \lambda\\ \vdots & & \ddots & \vdots\\ \lambda & \lambda & \cdots & a_{nn}\end{pmatrix} = \prod_{j=0}^n (a_{jj}-\lambda)\,\Big(1+\lambda\sum_{k=0}^n \frac{1}{a_{kk}-\lambda}\Big).$$
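Both the closed-form solution (92) and the determinant identity above are easy to confirm numerically. The following Python sketch (with arbitrary random data, chosen only for illustration) builds the matrix of (91) with $\lambda_j$ constant down the $j$-th column, as read off from (93):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
a = rng.uniform(2.0, 3.0, n)    # diagonal entries a_jj
lam = rng.uniform(0.0, 1.0, n)  # parameters lambda_j
b = rng.uniform(-1.0, 1.0, n)

M = np.tile(lam, (n, 1))        # M[i, j] = lambda_j for i != j ...
np.fill_diagonal(M, a)          # ... and a_jj on the diagonal

# closed form (92): c_j = (b_j - S_n) / (a_jj - lambda_j)
S = np.sum(lam * b / (a - lam)) / (1 + np.sum(lam / (a - lam)))
c = (b - S) / (a - lam)
assert np.allclose(M @ c, b)

# determinant: prod(a_jj - lambda_j) * (1 + sum lambda_k / (a_kk - lambda_k))
assert np.isclose(np.linalg.det(M),
                  np.prod(a - lam) * (1 + np.sum(lam / (a - lam))))
```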
We can now apply the solution of type (95) for the linear system (90).
Corollary 4.
If the condition (86) is satisfied for the elements of the approximation (80), then
$$y \approx \sum_{j=0}^n \frac{\langle x_j,y\rangle - S_n(\lambda)}{\langle x_j,x_j\rangle - \lambda}\,x_j,$$
in which
$$S_n(\lambda) = S_n\big(\lambda;\{\langle x_k,x_k\rangle, \langle x_k,y\rangle\}_{k=0}^n\big) = \frac{\lambda\sum_{k=0}^n \frac{\langle x_k,y\rangle}{\langle x_k,x_k\rangle-\lambda}}{1+\lambda\sum_{k=0}^n \frac{1}{\langle x_k,x_k\rangle-\lambda}},$$
is the best approximation in the sense of ordinary least squares, such that
$$\Big\|\sum_{j=0}^n \frac{\langle x_j,y\rangle - S_n(\lambda)}{\langle x_j,x_j\rangle-\lambda}\,x_j - y\Big\|^2 \leq \Big\|\sum_{j=0}^n \alpha_jx_j - y\Big\|^2 \qquad \big(\{\alpha_j\}_{j=0}^n\subset\mathbb{R}\big).$$
Note that (97) generalizes the orthogonal approximation (85), which is recovered for $\lambda = 0$. Substituting $n = 0, 1$, respectively, gives
$$y \approx \frac{\langle x_0,y\rangle}{\langle x_0,x_0\rangle}\,x_0 \qquad\text{and}\qquad y \approx \frac{\langle x_0,y\rangle\langle x_1,x_1\rangle - \lambda\langle x_1,y\rangle}{\langle x_0,x_0\rangle\langle x_1,x_1\rangle - \lambda^2}\,x_0 + \frac{\langle x_1,y\rangle\langle x_0,x_0\rangle - \lambda\langle x_0,y\rangle}{\langle x_0,x_0\rangle\langle x_1,x_1\rangle - \lambda^2}\,x_1,$$
and, for $n = 2$,
$$y \approx \sum_{(i,j,k)} \frac{\langle x_i,y\rangle\big(\langle x_j,x_j\rangle\langle x_k,x_k\rangle - \lambda^2\big) - \lambda\langle x_j,y\rangle\big(\langle x_k,x_k\rangle - \lambda\big) - \lambda\langle x_k,y\rangle\big(\langle x_j,x_j\rangle - \lambda\big)}{\langle x_0,x_0\rangle\langle x_1,x_1\rangle\langle x_2,x_2\rangle - \lambda^2\big(\langle x_0,x_0\rangle + \langle x_1,x_1\rangle + \langle x_2,x_2\rangle\big) + 2\lambda^3}\,x_i,$$
where $(i,j,k)$ runs over $(0,1,2)$, $(1,0,2)$ and $(2,0,1)$.
The question is now how to find the elements { x k } k = 0 n , such that they satisfy the generalized condition (86).
Theorem 9.
Let { v k } k = 0 be a finite or infinite sequence of variables defined in an inner product space, such that any finite number of elements { v k } k = 0 n are linearly independent. One can find constants { a i , j } , such that the elements
x 0 = v 0 , x 1 = v 1 + a 12 v 0 , x 2 = v 2 + a 22 v 1 + a 23 v 0 , x n = v n + a n 2 v n 1 + + a n , n + 1 v 0 ,
satisfy the condition (86).
Proof. 
Referring to (96), we first set, only for brevity,
$$S_{n-1}^*(\lambda) = \lambda - S_{n-1}\big(\lambda;\{\langle x_k,x_k\rangle,\; \lambda - \langle x_k,v_n\rangle\}_{k=0}^{n-1}\big) = \lambda\,\frac{1+\sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle}{\langle x_k,x_k\rangle-\lambda}}{1+\sum_{k=0}^{n-1}\frac{\lambda}{\langle x_k,x_k\rangle-\lambda}},$$
with $S_{-1}^*(\lambda) = \lambda$. Then, set recursively
$$x_0 = v_0,\qquad x_1 = v_1 - \frac{\langle x_0,v_1\rangle - S_0^*(\lambda)}{\langle x_0,x_0\rangle-\lambda}\,x_0,\qquad \ldots,\qquad x_n = v_n - \sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_k,x_k\rangle-\lambda}\,x_k.$$
Since the final shape of $x_n$ in (100) is a linear combination of $\{v_k\}_{k=0}^n$, it is enough to show that $\langle x_n,x_j\rangle = \lambda$ for every $j = 0,1,\ldots,n-1$. For this purpose, applying $\langle\cdot, x_j\rangle$ to both sides of the last equality of (100) gives
$$\lambda = \langle x_n,x_j\rangle = \Big\langle v_n - \sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_k,x_k\rangle-\lambda}\,x_k,\; x_j\Big\rangle = \langle v_n,x_j\rangle - \sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_k,x_k\rangle-\lambda}\,\langle x_k,x_j\rangle.$$
Therefore, we have to prove that
$$\sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_k,x_k\rangle-\lambda}\,\langle x_k,x_j\rangle = \langle v_n,x_j\rangle - \lambda.$$
We first simplify the left-hand side of (101) as follows:
$$\lambda\sum_{k=0,\,k\neq j}^{n-1}\frac{\langle x_k,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_k,x_k\rangle-\lambda} + \frac{\langle x_j,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_j,x_j\rangle-\lambda}\,\langle x_j,x_j\rangle = \lambda\sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_k,x_k\rangle-\lambda} + \langle x_j,v_n\rangle - S_{n-1}^*(\lambda).$$
It is now sufficient to use (99), in the form
$$\lambda\Big(1+\sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle}{\langle x_k,x_k\rangle-\lambda}\Big) = \Big(1+\sum_{k=0}^{n-1}\frac{\lambda}{\langle x_k,x_k\rangle-\lambda}\Big)\,S_{n-1}^*(\lambda),$$
in the last equality of (102) to complete the proof of (101). □
Here, one may ask how the relations (100) can be found. The answer is as follows. Since $\{v_k\}_{k=0}^n$ are assumed to be linearly independent, there are constants $\{q_k\}_{k=0}^{n-1}$ such that the general element in (98) can be rewritten as $x_n = v_n + \sum_{k=0}^{n-1} q_kx_k$. For $j = 0,1,\ldots,n-1$, apply $\langle\cdot, x_j\rangle$ to both sides of the latter to get
$$\langle x_n,x_j\rangle = \langle v_n,x_j\rangle + \sum_{k=0}^{n-1} q_k\langle x_k,x_j\rangle,$$
which leads to the familiar linear system
$$\begin{pmatrix} \langle x_0,x_0\rangle & \lambda & \cdots & \lambda\\ \lambda & \langle x_1,x_1\rangle & \cdots & \lambda\\ \vdots & & \ddots & \vdots\\ \lambda & \lambda & \cdots & \langle x_{n-1},x_{n-1}\rangle\end{pmatrix}\begin{pmatrix} q_0\\ q_1\\ \vdots\\ q_{n-1}\end{pmatrix} = \begin{pmatrix} \lambda - \langle x_0,v_n\rangle\\ \lambda - \langle x_1,v_n\rangle\\ \vdots\\ \lambda - \langle x_{n-1},v_n\rangle\end{pmatrix},$$
having the explicit solutions
$$q_k = -\frac{\langle x_k,v_n\rangle - S_{n-1}^*(\lambda)}{\langle x_k,x_k\rangle-\lambda}.$$
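The recursion (100) is, in effect, a Gram–Schmidt-type process that targets the constant inner product $\lambda$ instead of 0. A minimal numerical sketch in Python (generic random vectors, assumed linearly independent; the seed and dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
N, n, lam = 10, 5, 0.4
V = [rng.standard_normal(N) for _ in range(n + 1)]   # v_0, ..., v_n

X = [V[0]]
for v in V[1:]:
    d = np.array([x @ x - lam for x in X])           # <x_k, x_k> - lam
    t = np.array([x @ v for x in X])                 # <x_k, v>
    S = lam * (1 + np.sum(t / d)) / (1 + np.sum(lam / d))   # S*_{k-1}(lam)
    X.append(v - sum((tk - S) / dk * xk for tk, dk, xk in zip(t, d, X)))

G = np.array([[xi @ xj for xj in X] for xi in X])
assert np.allclose(G[~np.eye(n + 1, dtype=bool)], lam)  # <x_m, x_n> = lam, m != n
```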
Note that the procedure is simplified if the initial data are orthogonal. In other words, let $\{v_k\}_{k=0}^n$ be mutually orthogonal, i.e., $\langle v_n,v_m\rangle = \langle v_n,v_n\rangle\delta_{n,m}$, and let $\{x_k\}_{k=0}^n$ satisfy the condition (86) as before. Noting the last row of (100) and the fact from (98) that $x_n = \sum_{k=0}^n a_{n,n+1-k}v_k$ with $a_{n,1} = 1$, for $m\leq n$ we have
$$\langle x_n,x_m\rangle = \Big\langle x_n,\; v_m - \sum_{j=0}^{m-1}\frac{\langle x_j,v_m\rangle - S_{m-1}^*(\lambda)}{\langle x_j,x_j\rangle-\lambda}\,x_j\Big\rangle = \langle x_n,v_m\rangle - \lambda\sum_{j=0}^{m-1}\frac{\langle x_j,v_m\rangle - S_{m-1}^*(\lambda)}{\langle x_j,x_j\rangle-\lambda} = \sum_{k=0}^n a_{n,n+1-k}\langle v_m,v_k\rangle + \lambda - S_{m-1}^*(\lambda) = \begin{cases} \lambda & (m\neq n),\\[2pt] \langle x_n,x_n\rangle & (m=n),\end{cases}$$
where we have used the following consequence of (99):
$$\lambda\sum_{j=0}^{m-1}\frac{\langle x_j,v_m\rangle - S_{m-1}^*(\lambda)}{\langle x_j,x_j\rangle-\lambda} = S_{m-1}^*(\lambda) - \lambda.$$
For $m = 0,1,\ldots,n$, relation (103) eventually leads to the linear system
$$\begin{pmatrix} \langle v_0,v_0\rangle & \langle v_1,v_0\rangle & \cdots & \langle v_n,v_0\rangle\\ \langle v_0,v_1\rangle & \langle v_1,v_1\rangle & \cdots & \langle v_n,v_1\rangle\\ \vdots & & & \vdots\\ \langle v_0,v_n\rangle & \langle v_1,v_n\rangle & \cdots & \langle v_n,v_n\rangle\end{pmatrix}\begin{pmatrix} a_{n,n+1}\\ a_{n,n}\\ \vdots\\ a_{n,1}\end{pmatrix} = \begin{pmatrix} \lambda\\ S_0^*(\lambda)\\ \vdots\\ S_{n-2}^*(\lambda)\\ \langle x_n,x_n\rangle - \lambda + S_{n-1}^*(\lambda)\end{pmatrix}.$$
Now, applying the orthogonality condition $\langle v_n,v_m\rangle = \langle v_n,v_n\rangle\delta_{n,m}$ to the components of the coefficient matrix of (104) gives
$$a_{n,n+1-k} = \frac{S_{k-1}^*(\lambda)}{\langle v_k,v_k\rangle} \qquad\text{and}\qquad \langle x_n,x_n\rangle = \langle v_n,v_n\rangle + \lambda - S_{n-1}^*(\lambda).$$
Noting the identity
$$\lambda - S_{m-1}^*(\lambda) = -\lambda\,\frac{\sum_{k=0}^{m-1}\frac{\langle x_k,v_m\rangle - \lambda}{\langle x_k,x_k\rangle-\lambda}}{1+\sum_{k=0}^{m-1}\frac{\lambda}{\langle x_k,x_k\rangle-\lambda}},$$
we respectively obtain
$$x_n = v_n + \sum_{m=0}^{n-1}\frac{\lambda}{\langle v_m,v_m\rangle}\cdot\frac{1+\sum_{j=0}^{m-1}\frac{\langle x_j,v_m\rangle}{\langle x_j,x_j\rangle-\lambda}}{1+\sum_{j=0}^{m-1}\frac{\lambda}{\langle x_j,x_j\rangle-\lambda}}\,v_m,$$
and
$$\langle x_n,x_n\rangle = \langle v_n,v_n\rangle - \lambda\,\frac{\sum_{k=0}^{n-1}\frac{\langle x_k,v_n\rangle - \lambda}{\langle x_k,x_k\rangle-\lambda}}{1+\sum_{k=0}^{n-1}\frac{\lambda}{\langle x_k,x_k\rangle-\lambda}}.$$
On the other hand, for every $j = 0,1,\ldots,m-1$, since
$$\langle x_j, v_m\rangle = \Big\langle \sum_{k=0}^j p_kv_k,\; v_m\Big\rangle = \sum_{k=0}^j p_k\langle v_k,v_m\rangle = 0,$$
so
$$x_n = v_n + \sum_{m=0}^{n-1}\frac{\lambda}{\langle v_m,v_m\rangle\Big(1+\sum_{j=0}^{m-1}\frac{\lambda}{\langle x_j,x_j\rangle-\lambda}\Big)}\,v_m,$$
and
$$\frac{\lambda}{1+\sum_{k=0}^{n-1}\frac{\lambda}{\langle x_k,x_k\rangle-\lambda}} = \langle v_n,v_n\rangle + \lambda - \langle x_n,x_n\rangle,$$
which finally simplifies (105) as
$$x_n = v_n + \sum_{m=0}^{n-1}\frac{\langle v_m,v_m\rangle + \lambda - \langle x_m,x_m\rangle}{\langle v_m,v_m\rangle}\,v_m.$$
There is a simple recurrence relation for the final result (106) when $n\to n+1$. We have
$$x_{n+1} - v_{n+1} = x_n - v_n + \frac{\langle v_n,v_n\rangle + \lambda - \langle x_n,x_n\rangle}{\langle v_n,v_n\rangle}\,v_n,$$
which can be rewritten as
$$x_{n+1} = x_n + v_{n+1} + \frac{\lambda - \langle x_n,x_n\rangle}{\langle v_n,v_n\rangle}\,v_n.$$
For instance, the expanded forms of (106) or, equivalently, (107) for $n = 0, 1, 2, 3$ are
$$x_0 = v_0,\qquad x_1 = v_1 + \lambda\,\frac{v_0}{\langle v_0,v_0\rangle},\qquad x_2 = v_2 + \lambda\Big(1-\frac{\lambda}{\langle v_0,v_0\rangle}\Big)\frac{v_1}{\langle v_1,v_1\rangle} + \lambda\,\frac{v_0}{\langle v_0,v_0\rangle},$$
and
$$x_3 = v_3 + \lambda\Big(1-\frac{\lambda}{\langle v_0,v_0\rangle} - \lambda\Big(1-\frac{\lambda}{\langle v_0,v_0\rangle}\Big)^2\frac{1}{\langle v_1,v_1\rangle}\Big)\frac{v_2}{\langle v_2,v_2\rangle} + \lambda\Big(1-\frac{\lambda}{\langle v_0,v_0\rangle}\Big)\frac{v_1}{\langle v_1,v_1\rangle} + \lambda\,\frac{v_0}{\langle v_0,v_0\rangle},$$
satisfying the conditions
x 0 , x 1 = x 0 , x 2 = x 0 , x 3 = x 1 , x 2 = x 1 , x 3 = x 2 , x 3 = λ .
Let us consider an illustrative example here. The shifted Legendre polynomials
$$P_{n,+}(x) = \sum_{k=0}^n \frac{(-n)_k(n+1)_k}{(1)_k}\,\frac{x^k}{k!},$$
where $(a)_k = a(a+1)\cdots(a+k-1)$, satisfy the orthogonality relation
$$\int_0^1 P_{n,+}(x)P_{m,+}(x)\,dx = \frac{1}{2n+1}\,\delta_{n,m}.$$
Since $P_{0,+}(x) = 1$, $P_{1,+}(x) = -2x+1$, $P_{2,+}(x) = 6x^2-6x+1$, and $P_{3,+}(x) = -20x^3+30x^2-12x+1$, replacing them in (108) eventually yields
$$P_{0,+}(x;\lambda) = 1,\qquad P_{1,+}(x;\lambda) = -2x+\lambda+1,\qquad P_{2,+}(x;\lambda) = 6x^2 + 6\big(\lambda^2-\lambda-1\big)x - 3\lambda^2+4\lambda+1,$$
and
$$P_{3,+}(x;\lambda) = -20x^3 + 30\big({-3\lambda^4}+6\lambda^3-4\lambda^2+\lambda+1\big)x^2 + 6\big(15\lambda^4-30\lambda^3+21\lambda^2-6\lambda-2\big)x - 15\lambda^4+30\lambda^3-23\lambda^2+9\lambda+1,$$
satisfying the simple uncorrelatedness condition
$$\int_0^1 P_{n,+}(x;\lambda)P_{m,+}(x;\lambda)\,dx = \lambda \qquad (m\neq n;\; m,n = 0,1,2,3).$$
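These polynomials can be reproduced directly from the recurrence (107). A short symbolic check with SymPy (a sketch; it uses the shifted Legendre data above and keeps λ symbolic):

```python
import sympy as sp

x, lam = sp.symbols('x lambda')
ip = lambda f, g: sp.integrate(sp.expand(f * g), (x, 0, 1))  # <f, g> on [0, 1]

v = [sp.expand(sp.legendre(n, 1 - 2 * x)) for n in range(4)] # P_{n,+}(x)

X = [v[0]]                                                   # recurrence (107)
for n in range(3):
    X.append(sp.expand(X[n] + v[n + 1]
                       + (lam - ip(X[n], X[n])) / ip(v[n], v[n]) * v[n]))

for m in range(4):
    for n in range(m):
        assert sp.simplify(ip(X[m], X[n]) - lam) == 0        # = lambda, m != n
print(sp.expand(X[1]))   # -> -2*x + lambda + 1, i.e. P_{1,+}(x; lambda)
```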
Remark 3.
A very obvious case is when λ = v 0 , v 0 as (106) or (107) takes the simple form x n = v n + v 0 and, for m n , we therefore have
x n , x m = v n + v 0 , v m + v 0 = v n , v m + v n , v 0 + v 0 , v m + v 0 , v 0 = v 0 , v 0 .
In this sense, for the above-mentioned example, since λ = P 0 , + ( x ) , P 0 , + ( x ) = 1 , so P n , + ( x ; 1 ) = P n , + ( x ) + 1 . See also Section 9 in this regard.

6.1. A Biorthogonality Property

Reconsider the condition (86), and let $x_r$ be a fixed variable. Since
$$\langle x_m,x_n\rangle = \lambda \ \ (m\neq n) \qquad\text{and}\qquad \langle x_m,x_r\rangle = \lambda \ \ (m\neq r),$$
it follows that $\langle x_m,\, x_n - x_r\rangle = 0$ for $m\neq n, r$. This means that the sequence $\{x_m\}_{m=0}^\infty$ is biorthogonal with respect to the sequence $\{x_n^* = x_n - x_r\}_{n=0}^\infty$, particularly when $r = 0$.

6.2. A Basic Remark

Theorem 9 and analytically solving the linear system (91) in Theorem 8 show that we can apply the aforesaid approach for any functional type of linear approximations. In other words, consider the approximation
f ( x ) k = 0 n c k Φ k ( x ) ,
in which { c k } k = 0 n are unknown coefficients and { Φ k ( x ) } k = 0 n are known basis functions. Applying the linear functionals { L j } j = 0 n on both sides of (109) one by one will lead to the following linear system:
L 0 ( Φ 0 ( x ) ) L 1 ( Φ 0 ( x ) ) L n ( Φ 0 ( x ) ) L 0 ( Φ 1 ( x ) ) L 1 ( Φ 1 ( x ) ) L n ( Φ 1 ( x ) ) L 0 ( Φ n ( x ) ) L 1 ( Φ n ( x ) ) L n ( Φ n ( x ) ) c 0 c 1 c n = L 0 ( f ( x ) ) L 1 ( f ( x ) ) L n ( f ( x ) ) .
Now, if the uncorrelatedness condition
$$L_m\big(\Phi_n(x)\big) = \begin{cases} \lambda\neq 0 & (m\neq n),\\[2pt] L_n\big(\Phi_n(x)\big) & (m=n),\end{cases}$$
is employed on the components of the above linear system, we will reach the same solution as in Theorem 8. The concepts presented in the preceding subsections help us now derive some orthogonal sequences of continuous functions in the next section.

6.3. Simple Uncorrelated Functions of Classical Type

Noting Theorem 7 again, let $a(x)$, $b(x)$, $u(x)$ and $v(x)$ be four given functions, continuous on the interval $[a,b]$, and let $\{\lambda_n\}$ be a sequence of eigenvalues. A Sturm–Liouville problem is a second-order linear differential equation of the form
$$a(x)y_n''(x) + b(x)y_n'(x) + \big(\lambda_n u(x) + v(x)\big)y_n(x) = 0,\qquad\text{equivalently}\qquad L[y_n] + \lambda_n w(x)y_n = 0,$$
where
$$L[y_n] = \frac{d}{dx}\Big(r(x)\,y_n'(x)\Big) + q(x)\,y_n(x),$$
in which
$$r(x) = a(x)\exp\Big(\int \frac{b(x)-a'(x)}{a(x)}\,dx\Big),\qquad q(x) = v(x)\exp\Big(\int \frac{b(x)-a'(x)}{a(x)}\,dx\Big),$$
and $w(x) = u(x)\exp\big(\int \frac{b(x)-a'(x)}{a(x)}\,dx\big) > 0$ for $x\in(a,b)$.
Equation (110) should be considered on an open interval, say $(a,b)$, with boundary conditions $\alpha_1y_n(a)+\beta_1y_n'(a) = 0$ and $\alpha_2y_n(b)+\beta_2y_n'(b) = 0$, where $\alpha_1, \alpha_2$ and $\beta_1, \beta_2$ are given constants, and $r(x)$, $r'(x)$, $q(x)$ and $w(x)$ are assumed continuous for $x\in[a,b]$.
Suppose that y n ( x ) and y m ( x ) are two eigenfunctions of Equation (110). According to Sturm–Liouville theory [21], they are orthogonal with respect to the weight function w ( x ) on ( a , b ) under the given conditions, such that
a b w ( x ) y n ( x ) y m ( x ) d x = a b w ( x ) y n 2 ( x ) d x δ n , m .
Now if, in (111), there exists a specific function, say $z(x)$, such that
$$y_n(x) = \Phi_n(x) - z(x),$$
then substituting (112) into (111) yields
$$\int_a^b w\big(\Phi_n - z\big)\big(\Phi_m - z\big)\,dx = \int_a^b w\,\Phi_n\Phi_m\,dx - \int_a^b wz\,\Phi_n\,dx - \int_a^b wz\,\Phi_m\,dx + \int_a^b wz^2\,dx = \Big(\int_a^b w(x)\big(\Phi_n(x)-z(x)\big)^2dx\Big)\,\delta_{n,m}.$$
On the other hand, the sequence of functions Φ n ( x ) n = 0 is said to be simply uncorrelated if, for the known specific function v ( x ) and for every n, we have
a b v ( x ) Φ n ( x ) d x = λ 0 ,
where the non-negative parameter λ is independent of n.
Based on this definition, if $\{\Phi_n(x)\}_{n=0}^\infty$ in (112) is a sequence of real functions such that
$$\int_a^b w(x)z(x)\Phi_n(x)\,dx = \int_a^b w(x)z^2(x)\,dx \neq 0,$$
then (113) is simplified as
$$\int_a^b w\big(\Phi_n - z\big)\big(\Phi_m - z\big)\,dx = \int_a^b w\,\Phi_n\Phi_m\,dx - \int_a^b wz^2\,dx = \Big(\int_a^b w\,\Phi_n^2\,dx - \int_a^b wz^2\,dx\Big)\,\delta_{n,m}.$$
Moreover, replacing (112) in (110) gives the inhomogeneous differential equation
$$a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \big(\lambda_n u(x)+v(x)\big)\Phi_n(x) = a(x)z''(x) + b(x)z'(x) + \big(\lambda_n u(x)+v(x)\big)z(x).$$
Let us impose an extra condition on Equation (116) for the function $z(x)$ as follows:
$$a(x)z''(x) + b(x)z'(x) + v(x)z(x) = A_z^*\, u(x)z(x),$$
where $A_z^*$ is a constant value independent of $x$. Using relations (114) and (115) and Equations (116) and (117), we can extract a remarkable result for the inhomogeneous type of Sturm–Liouville equations.
Theorem 10.
Let $a(x)$, $b(x)$, $u(x)$ and $v(x)$ be four given functions, continuous on $[a,b]$. If
$$w(x) = u(x)\exp\Big(\int \frac{b(x)-a'(x)}{a(x)}\,dx\Big),$$
is positive on $(a,b)$ and the sequence $\{\Phi_n(x)\}_{n=0}^\infty$ with the corresponding eigenvalues $\{\lambda_n\}$ satisfies an inhomogeneous differential equation of the form
$$a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \big(\lambda_n u(x) + v(x)\big)\Phi_n(x) = \big(\lambda_n + A_z^*\big)\,u(x)z(x),$$
such that
$$\int_a^b w(x)z(x)\Phi_n(x)\,dx = \int_a^b w(x)z^2(x)\,dx,$$
and
$$\frac{a(x)z''(x) + b(x)z'(x) + v(x)z(x)}{u(x)z(x)} = A_z^*,$$
where $A_z^*$ is constant, and also $\alpha_1\Phi_n(a)+\beta_1\Phi_n'(a) = 0$ and $\alpha_2\Phi_n(b)+\beta_2\Phi_n'(b) = 0$ are two boundary conditions with given constants $\alpha_1, \beta_1, \alpha_2, \beta_2$, then the sequence of functions of the form
$$G_n(x) = y_{1,n}(x) - \frac{\int_a^b w(x)z(x)y_{1,n}(x)\,dx}{\int_a^b w(x)z(x)y_{2,n}(x)\,dx}\,y_{2,n}(x),$$
is orthogonal with respect to the weight function (118) on $[a,b]$, provided that
$$\int_a^b w(x)z(x)y_{2,n}(x)\,dx \neq 0 \qquad (n\in\mathbb{Z}^+).$$
Proof. 
First, write Equation (119) in the self-adjoint form
$$\big(r(x)\Phi_n'(x)\big)' + \big(\lambda_n w(x) + q(x)\big)\Phi_n(x) = \big(\lambda_n + A_z^*\big)\,w(x)z(x),$$
and, for the index $m$, as
$$\big(r(x)\Phi_m'(x)\big)' + \big(\lambda_m w(x) + q(x)\big)\Phi_m(x) = \big(\lambda_m + A_z^*\big)\,w(x)z(x),$$
in which $r(x) = a(x)\exp\big(\int \frac{b(x)-a'(x)}{a(x)}dx\big)$ and $q(x) = v(x)\exp\big(\int \frac{b(x)-a'(x)}{a(x)}dx\big)$.
  • On multiplying (121) by $\Phi_m(x)$ and (122) by $\Phi_n(x)$ and subtracting, we get
$$\Big[r(x)\big(\Phi_m\Phi_n' - \Phi_n\Phi_m'\big)\Big]_a^b + (\lambda_n-\lambda_m)\int_a^b w\,\Phi_n\Phi_m\,dx = \big(\lambda_n+A_z^*\big)\int_a^b wz\,\Phi_m\,dx - \big(\lambda_m+A_z^*\big)\int_a^b wz\,\Phi_n\,dx = (\lambda_n-\lambda_m)\int_a^b wz^2\,dx,$$
which results in
$$(\lambda_n-\lambda_m)\Big(\int_a^b w\,\Phi_n\Phi_m\,dx - \int_a^b wz^2\,dx\Big) = -\Big[r(x)\big(\Phi_m\Phi_n' - \Phi_n\Phi_m'\big)\Big]_a^b = 0 \qquad (n\neq m).$$
On the other hand, Equation (119) is inhomogeneous, and its general solution takes the form
$$\Phi_n(x) = c_1y_{1,n}(x) + c_2y_{2,n}(x) + z(x) \qquad (c_1, c_2\in\mathbb{R}),$$
where $c_1y_{1,n}(x) + c_2y_{2,n}(x)$ is the general solution of the homogeneous equation
$$a(x)\Phi_n''(x) + b(x)\Phi_n'(x) + \big(\lambda_n u(x) + v(x)\big)\Phi_n(x) = 0,$$
with two independent solutions $y_{1,n}(x)$ and $y_{2,n}(x)$. So, substituting (123) into (114) fixes $c_2$ and simplifies the solution as
$$\Phi_n(x) = c_1\Big(y_{1,n}(x) - \frac{\int_a^b wz\,y_{1,n}\,dx}{\int_a^b wz\,y_{2,n}\,dx}\,y_{2,n}(x)\Big) + z(x),$$
provided that $\int_a^b wz\,y_{2,n}\,dx \neq 0$ for every $n\in\mathbb{Z}^+$. Finally, for $c_1 = 1$, the sequence
$$\Phi_n(x) - z(x) = y_{1,n}(x) - \frac{\int_a^b wz\,y_{1,n}\,dx}{\int_a^b wz\,y_{2,n}\,dx}\,y_{2,n}(x),$$
would be orthogonal with respect to the weight function $w(x)$ in (118) on $[a,b]$, according to the relation (115). □
Remark 4.
The condition $\int_a^b wz\,y_{2,n}\,dx \neq 0$ in (124) reveals the importance of the role of the fixed function $z(x)$ in the above-mentioned theorem. Moreover, if $z(x)$ is a constant function, say $z(x) = 1$, the theorem can no longer be applied, because $\{y_{1,n}(x)\}$ is orthogonal and (especially for orthogonal polynomial sequences) one typically has $\int_a^b w\,y_{1,n}\,dx = 0$.
  • In other words, for $z(x) = 1$, the sequence $G_n(x)$ in (120) reduces to the single case $G_n(x) = y_{1,n}(x)$.
Finally, noting in (124) that $\int_a^b wz\,\Phi_n\,dx = \int_a^b wz^2\,dx$, the norm square value is explicitly computed as follows:
$$\int_a^b w\,G_n^2\,dx = \int_a^b w\,y_{1,n}^2\,dx + \Big(\frac{\int_a^b wz\,y_{1,n}\,dx}{\int_a^b wz\,y_{2,n}\,dx}\Big)^2\int_a^b w\,y_{2,n}^2\,dx - 2\,\frac{\int_a^b wz\,y_{1,n}\,dx}{\int_a^b wz\,y_{2,n}\,dx}\int_a^b w\,y_{1,n}\,y_{2,n}\,dx.$$

6.3.1. New Classes of Orthogonal Functions

In order to introduce some new sequences of orthogonal functions using Theorem 10, we first recall some comments available in the literature. A generalized hypergeometric series of order ( p , q ) is defined as
F q p a 1 , , a p b 1 , , b q x = k = 0 ( a 1 ) k ( a p ) k ( b 1 ) k ( b q ) k x k k ! ,
where ( a ) k = Γ ( a + k ) Γ ( a ) = a ( a + 1 ) ( a + k 1 ) denotes the Pochhammer symbol, and Γ ( x ) = 0 t x 1 e t d t , ( Re ( x ) > 0 ) , is the well-known gamma function.
There are two special cases of the series (125) arising in many physics and engineering problems [20,25]. The first case is the Gauss hypergeometric function whose integral representation
F 1 2 a , b c x = Γ ( c ) Γ ( a ) Γ ( c a ) 0 1 t a 1 ( 1 t ) c a 1 ( 1 x t ) b d t ( c > a > 0 ) ,
satisfies the differential equation
$$x(1-x)\,y''(x) + \big(c-(a+b+1)x\big)\,y'(x) - ab\,y(x) = 0,$$
with the general solution
y = c 1 F 1 2 a , b c x + c 2 x 1 c F 1 2 a + 1 c , b + 1 c 2 c x .
When the well-known beta integral is noted, substituting x = 1 into (126) gives
F 1 2 a , b c 1 = Γ ( c ) Γ ( c a b ) Γ ( c a ) Γ ( c b ) .
The second case is known in the literature as the confluent hypergeometric function with the integral representation
F 1 1 a c x = Γ ( c ) Γ ( a ) Γ ( c a ) 0 1 t a 1 ( 1 t ) c a 1 e x t d t ( c > a > 0 ) ,
satisfying the equation
$$x\,y''(x) + (c-x)\,y'(x) - a\,y(x) = 0,$$
whose general solution is given by
y = b 1 F 1 1 a c x + b 2 x 1 c F 1 1 a + 1 c 2 c x .
In this sense, there are two sequences of classical orthogonal polynomials that can be represented in terms of Gauss and confluent hypergeometric series. The first sequence is the shifted monic Jacobi polynomials
P ¯ n , + ( α , β ) ( x ) = ( 1 ) n ( β + 1 ) n ( n + α + β + 1 ) n F 1 2 n , n + α + β + 1 β + 1 x ,
that are orthogonal with respect to the weight function ( 1 x ) α x β on [ 0 , 1 ] as
0 1 ( 1 x ) α x β P ¯ m , + ( α , β ) ( x ) P ¯ n , + ( α , β ) ( x ) d x = n ! 2 2 n Γ ( n + α + β + 1 ) Γ ( n + α + 1 ) Γ ( n + β + 1 ) Γ ( 2 n + α + β + 1 ) Γ ( 2 n + α + β + 2 ) δ n , m ,
and the second sequence is the monic (generalized) Laguerre polynomials
L ¯ n ( α ) ( x ) = ( 1 ) n ( α + 1 ) n F 1 1 n α + 1 x ,
that are orthogonal with respect to the weight function x α e x on [ 0 , ) as
0 x α e x L ¯ m ( α ) ( x ) L ¯ n ( α ) ( x ) d x = n ! Γ ( n + α + 1 ) δ n , m .
We can now introduce some sequences of orthogonal functions by means of Theorem 10.
First Sequence
With reference to the Gauss hypergeometric Equation (127), consider the differential equation
$$x(1-x)\Phi_n''(x) + \big(c-(d+1)x\big)\Phi_n'(x) + n(n+d)\Phi_n(x) = x(1-x)z''(x) + \big(c-(d+1)x\big)z'(x) + n(n+d)z(x),$$
with the general solution
$$\Phi_n(x) = c_1\,{}_2F_1\big(-n,\, n+d;\; c;\; x\big) + c_2\,x^{1-c}\,{}_2F_1\big(-n+1-c,\, n+d+1-c;\; 2-c;\; x\big) + z(x).$$
For a known parameter $e$, assume in (133) that
$$x(1-x)z''(x) + \big(c-(d+1)x\big)z'(x) = -e(e+d)\,z(x),$$
which, according to (127) and (128), has the general solution
$$z(x) = z(x; A, B, c, d, e) = A\,{}_2F_1\big(-e,\, e+d;\; c;\; x\big) + B\,x^{1-c}\,{}_2F_1\big(-e+1-c,\, e+d+1-c;\; 2-c;\; x\big),$$
in which $A, B$ are two free parameters.
in which A , B are two free parameters. It turns out from (133) to (136) that the general solution of the equation
x ( 1 x ) Φ n ( x ) + ( c ( d + 1 ) x ) Φ n ( x ) + n ( n + d ) Φ n ( x ) = n ( n + d ) e ( e + d ) × A F 1 2 e , e + d c x + B x 1 c F 1 2 e + 1 c , e + d + 1 c 2 c x ,
is as
Φ n ( x ) = c 1 F 1 2 n , n + d c x + c 2 x 1 c F 1 2 n + 1 c , n + d + 1 c 2 c x + A F 1 2 e , e + d c x + B x 1 c F 1 2 e + 1 c , e + d + 1 c 2 c x .
Note that the sequence (138) is also the general solution of the fourth-order equation
x ( 1 x ) x ( 1 x ) Φ n ( x ) + ( c ( d + 1 ) x ) Φ n ( x ) + n ( n + d ) Φ n ( x ) + ( c ( d + 1 ) x ) x ( 1 x ) Φ n ( x ) + ( c ( d + 1 ) x ) Φ n ( x ) + n ( n + d ) Φ n ( x ) + e ( e + d ) x ( 1 x ) Φ n ( x ) + ( c ( d + 1 ) x ) Φ n ( x ) + n ( n + d ) Φ n ( x ) = 0 .
Considering that there are five parameters in $z(x; A, B, c, d, e)$ in (136), together with the well-known identity ${}_2F_1(a,c;c;x) = (1-x)^{-a}$, various cases can be considered for Equation (137) and its solution (138). Here, we introduce four orthogonal subsequences of (138).
(i) 
First subsequence. For two given parameters $p, q\in\mathbb{R}$, if
$$(A, B, c, d, e) = \big(0,\, 1,\, 1+p,\, 1+p+q,\, -p-q\big),$$
is replaced in (137), i.e.,
$$x(1-x)\Phi_n''(x) - \big((p+q+2)x - p - 1\big)\Phi_n'(x) + n(n+p+q+1)\Phi_n(x) = (n+1)(n+p+q)\,x^{-p}(1-x)^{-q},$$
its general solution, according to (138), reads as
$$\Phi_n(x) = c_1\,{}_2F_1\big(-n, n+p+q+1; p+1; x\big) + c_2\,x^{-p}\,{}_2F_1\big(-n-p, n+1+q; 1-p; x\big) + x^{-p}(1-x)^{-q}.$$
As the weight function corresponding to the self-adjoint form of Equation (139) is $w(x;p,q) = x^p(1-x)^q$ and $z(x) = x^{-p}(1-x)^{-q}$, applying the condition (114) to the solution (140) gives
c 1 0 1 F 1 2 n , n + p + q + 1 p + 1 x d x + c 2 0 1 x p F 1 2 n p , n + 1 + q 1 p x d x = 0 .
Since in (141),
0 1 F 1 2 n , n + p + q + 1 p + 1 x d x = k = 0 n ( n ) k ( n + p + q + 1 ) k ( p + 1 ) k k ! 1 1 + k = F 2 3 n , n + p + q + 1 , 1 p + 1 , 2 1 = p ( n + 1 ) ( n + p + q ) 1 + ( 1 ) n ( q ) n + 1 ( p ) n + 1 ,
and
0 1 x p F 1 2 n p , n + 1 + q 1 p x d x = k = 0 n ( n p ) k ( n + 1 + q ) k ( 1 p ) k k ! 1 1 p + k = 1 1 p F 1 2 n p , n + 1 + q 2 p 1 = ( 1 ) n ( p + q ) n ( n + 1 ) ! B ( 1 p , 1 q ) ,
the first orthogonal subsequence appears as
G n ( 1 ) ( x ; p , q ) = F 1 2 n , n + p + q + 1 p + 1 x n ! ( 1 ) n ( p ) n + 1 + ( q ) n + 1 B ( 1 p , 1 q ) ( p + q ) n + 1 ( p + 1 ) n x p F 1 2 n p , n + 1 + q 1 p x ,
which is orthogonal with respect to the weight function $w(x;p,q) = x^p(1-x)^q$ on $[0,1]$ if and only if $p, q\in(-1,0)$, because $\Phi_n(x)$ must remain convergent at the points $x = 0, 1$ in (140).
  • To derive (142), we have used the identity
F 2 3 a 1 , a 2 , 1 b 1 , 2 1 = b 1 1 ( a 1 1 ) ( a 2 1 ) F 1 2 a 1 1 , a 2 1 b 1 1 1 1 .
Also, notice that the first term of (143) is the same as shifted Jacobi polynomials P n , + ( q , p ) ( x ) defined in (131).
(ii) 
Second subsequence. If $(A, B, c, d, e) = (0,\, 1,\, 1+p,\, 1+p+q,\, -p)$ is replaced in (137), i.e.,
$$x(1-x)\Phi_n''(x) - \big((p+q+2)x-p-1\big)\Phi_n'(x) + n(n+p+q+1)\Phi_n(x) = (n+p)(n+q+1)\,x^{-p},$$
its general solution, according to (138), is
$$\Phi_n(x) = c_1\,{}_2F_1\big(-n, n+p+q+1; p+1; x\big) + c_2\,x^{-p}\,{}_2F_1\big(-n-p, n+1+q; 1-p; x\big) + x^{-p}.$$
As $z(x) = x^{-p}$ in (144), applying the condition (114) to the solution (145) gives
c 1 0 1 ( 1 x ) q F 1 2 n , n + p + q + 1 p + 1 x d x + c 2 0 1 x p ( 1 x ) q F 1 2 n p , n + 1 + q 1 p x d x = 0 .
Since in (146),
0 1 ( 1 x ) q F 1 2 n , n + p + q + 1 p + 1 x d x = k = 0 n ( n ) k ( n + p + q + 1 ) k ( p + 1 ) k k ! ( 1 ) k ( q + 1 ) ( q + 2 ) k = 1 q + 1 F 2 3 n , n + p + q + 1 , 1 p + 1 , q + 2 1 = p ( p + n ) ( q + 1 + n ) ,
and
0 1 x p ( 1 x ) q F 1 2 n p , n + 1 + q 1 p x d x = k = 0 n ( n p ) k ( n + 1 + q ) k ( 1 p ) k k ! B ( k + 1 p , q + 1 ) = B ( 1 p , q + 1 ) F 1 2 n p , n + 1 + q 2 p + q 1 = ( 1 ) n ( p ) n ( q + 1 ) ( q + 2 ) n ,
the second subsequence is finally derived as
G n ( 2 ) ( x ; p , q ) = F 1 2 n , n + p + q + 1 p + 1 x ( 1 ) n ( q + 1 ) n ( p + 1 ) n x p F 1 2 n p , n + 1 + q 1 p x ,
which is orthogonal with respect to the weight function $w(x;p,q) = x^p(1-x)^q$ on $[0,1]$ for $p\in(-1,0)$ and $q > -1$. To derive (147), we have used the Pfaff–Saalschütz summation theorem [26]:
$${}_3F_2\big(-n,\, a,\, b;\; d,\, e;\; 1\big) = \frac{(d-a)_n\,(d-b)_n}{(d)_n\,(d-a-b)_n} \qquad \big({-n}+a+b+1 = d+e\big).$$
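A numerical sanity check of this subsequence is straightforward: rather than hard-coding the connection coefficient, one can determine $c_2$ from the constraint (114) directly and test the orthogonality. In the Python sketch below, the values of p and q are arbitrary choices within the stated range:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import hyp2f1

p, q = -0.4, 0.7                      # p in (-1, 0), q > -1
w = lambda x: x**p * (1 - x)**q       # weight function
z = lambda x: x**(-p)                 # fixed function of case (ii)

F1 = lambda n, x: hyp2f1(-n, n + p + q + 1, p + 1, x)
F2 = lambda n, x: hyp2f1(-n - p, n + 1 + q, 1 - p, x)

def G(n, x):
    # pick c2 so that (114) holds for Phi_n = F1 + c2 x^{-p} F2 + z,
    # then return G_n = Phi_n - z, which must be w-orthogonal by (115)
    I1 = quad(lambda t: w(t) * z(t) * F1(n, t), 0, 1)[0]
    I2 = quad(lambda t: w(t) * z(t) * t**(-p) * F2(n, t), 0, 1)[0]
    return F1(n, x) - I1 / I2 * x**(-p) * F2(n, x)

for n in range(3):
    for m in range(n):
        val = quad(lambda t: w(t) * G(n, t) * G(m, t), 0, 1)[0]
        assert abs(val) < 1e-6
```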
(iii) 
Third subsequence. If $(A, B, c, d, e) = (1,\, 0,\, 1+p,\, 1+p+q,\, -q)$ is replaced in (137), i.e.,
$$x(1-x)\Phi_n''(x) - \big((p+q+2)x-p-1\big)\Phi_n'(x) + n(n+p+q+1)\Phi_n(x) = (n+q)(n+p+1)\,(1-x)^{-q},$$
its general solution reads as
$$\Phi_n(x) = c_1\,{}_2F_1\big(-n, n+p+q+1; p+1; x\big) + c_2\,x^{-p}\,{}_2F_1\big(-n-p, n+1+q; 1-p; x\big) + (1-x)^{-q}.$$
As $z(x) = (1-x)^{-q}$ in (149), applying the condition (114) to the solution (150) yields
c 1 0 1 x p F 1 2 n , n + p + q + 1 p + 1 x d x + c 2 0 1 F 1 2 n p , n + 1 + q 1 p x d x = 0 .
Since in (151),
0 1 x p F 1 2 n , n + p + q + 1 p + 1 x d x = k = 0 n ( n ) k ( n + p + q + 1 ) k ( p + 1 ) k k ! 1 p + 1 + k = 1 p + 1 F 1 2 n , n + p + q + 1 p + 2 1 = ( 1 ) n ( q ) n ( p + 1 ) n + 1 ,
and
0 1 F 1 2 n p , n + 1 + q 1 p x d x = k = 0 n ( n p ) k ( n + 1 + q ) k ( 1 p ) k k ! ( 1 ) k ( 2 ) k = F 2 3 n p , n + q + 1 , 1 1 p , 2 1 = p ( n + p + 1 ) ( n + q ) 1 + ( 1 ) n q B ( p , q ) ( p + q + 1 ) n n ! ,
the third subsequence appears as
G n ( 3 ) ( x ; p , q ) = F 1 2 n , n + p + q + 1 p + 1 x + ( 1 ) n n ! ( q ) n + 1 x p ( p ) n + 1 n ! + ( 1 ) n B ( p , q ) ( p + q + 1 ) n F 1 2 n p , n + 1 + q 1 p x ,
which is orthogonal with respect to the weight $x^p(1-x)^q$ on $[0,1]$ for $-1 < p, q < 0$.
From (143), (148) and (152), we can conclude that, in fact,
G n ( k ) ( x ; p , q ) = F 1 2 n , n + p + q + 1 p + 1 x + λ n ( k ) ( p , q ) x p F 1 2 n p , n + 1 + q 1 p x ,
where
λ n ( 1 ) ( p , q ) = n ! ( 1 ) n ( p ) n + 1 + ( q ) n + 1 B ( 1 p , 1 q ) ( p + q ) n + 1 ( p + 1 ) n , λ n ( 2 ) ( p , q ) = ( 1 ) n ( q + 1 ) n ( p + 1 ) n ,
and
λ n ( 3 ) ( p , q ) = ( 1 ) n n ! ( q ) n + 1 ( p ) n + 1 n ! + ( 1 ) n B ( p , q ) ( p + q + 1 ) n .
However, as we pointed out, there are also other orthogonal sequences that are different from the above-mentioned general form. We consider one case here.
(iv) 
Fourth subsequence. If $(A, B, c, d, e) = (A,\, B,\, 1+p,\, 1+2p,\, -p)$, where $A, B$ are two free parameters, is replaced in (137), i.e.,
$$x(1-x)\Phi_n''(x) + \big(p+1-(2p+2)x\big)\Phi_n'(x) + n(n+2p+1)\Phi_n(x) = (n+p)(n+p+1)\,\big(A(1-x)^{-p} + Bx^{-p}\big),$$
its general solution is
$$\Phi_n(x) = c_1\,{}_2F_1\big(-n, n+2p+1; p+1; x\big) + c_2\,x^{-p}\,{}_2F_1\big(-n-p, n+1+p; 1-p; x\big) + A(1-x)^{-p} + Bx^{-p}.$$
As $z(x) = A(1-x)^{-p} + Bx^{-p}$ in (153), applying (114) to the solution (154) gives
c 1 0 1 A x p + B ( 1 x ) p F 1 2 n , n + 2 p + 1 p + 1 x d x + c 2 0 1 A + B x p ( 1 x ) p F 1 2 n p , n + p + 1 1 p x d x = 0 .
Since, in (155),
0 1 A x p + B ( 1 x ) p F 1 2 n , n + 2 p + 1 p + 1 x d x = ( 1 ) n A + B p ( n + p ) ( n + p + 1 ) ,
and
0 1 A + B x p ( 1 x ) p F 1 2 n p , n + p + 1 1 p x d x = p ( n + p ) ( n + p + 1 ) A 1 + ( 1 ) n p B ( p , q ) ( 2 p + 1 ) n n ! + ( 1 ) n B ,
the fourth subsequence appears as
G n ( 4 ) ( x ; p ) = F 1 2 n , n + 2 p + 1 p + 1 x + ( 1 ) n A + B x p ( 1 ) n + p n ! B ( p , p ) ( 2 p + 1 ) n A B F 1 2 n p , n + p + 1 1 p x ,
which is orthogonal with respect to the weight function $w(x;p) = x^p(1-x)^p$ on $[0,1]$ for $-1 < p < 0$.
Second Sequence
Noting the confluent hypergeometric Equation (129), this time we consider the differential equation
$$x\Phi_n''(x) + (c-x)\Phi_n'(x) + n\Phi_n(x) = xz''(x) + (c-x)z'(x) + nz(x),$$
with the general solution
$$\Phi_n(x) = b_1\,{}_1F_1\big(-n;\, c;\, x\big) + b_2\,x^{1-c}\,{}_1F_1\big(-n+1-c;\, 2-c;\, x\big) + z(x).$$
For a known parameter $e$, assume in (157) that
$$xz''(x) + (c-x)z'(x) = -e\,z(x),$$
which, according to (129) and (130), has the general solution
$$z(x) = z(x; A^*, B^*, c, e) = A^*\,{}_1F_1\big(-e;\, c;\, x\big) + B^*\,x^{1-c}\,{}_1F_1\big(-e+1-c;\, 2-c;\, x\big),$$
in which $A^*, B^*$ are free parameters. Consequently, from (157)–(160), we can conclude that the general solution of the equation
$$x\Phi_n''(x) + (c-x)\Phi_n'(x) + n\Phi_n(x) = (n-e)\Big(A^*\,{}_1F_1\big(-e; c; x\big) + B^*\,x^{1-c}\,{}_1F_1\big(-e+1-c; 2-c; x\big)\Big),$$
is
$$\Phi_n(x) = b_1\,{}_1F_1\big(-n; c; x\big) + b_2\,x^{1-c}\,{}_1F_1\big(-n+1-c; 2-c; x\big) + A^*\,{}_1F_1\big(-e; c; x\big) + B^*\,x^{1-c}\,{}_1F_1\big(-e+1-c; 2-c; x\big).$$
Note that (162) is also the general solution of the fourth-order equation
x x Φ n ( x ) + ( c x ) Φ n ( x ) + n Φ n ( x ) + ( c x ) x Φ n ( x ) + ( c x ) Φ n ( x ) + n Φ n ( x ) + e x Φ n ( x ) + ( c x ) Φ n ( x ) + n Φ n ( x ) = 0 .
Similar to the preceding First Sequence section, there are various subcases of the sequence (162). Here, we consider one of them. For a given parameter $p\in\mathbb{R}$, if
$$(A^*, B^*, c, e) = (0,\, 1,\, 1+p,\, -p),$$
is replaced in (161), i.e.,
$$x\Phi_n''(x) + (p+1-x)\Phi_n'(x) + n\Phi_n(x) = (n+p)\,x^{-p},$$
then its general solution reads as
$$\Phi_n(x;p) = b_1\,{}_1F_1\big(-n;\, p+1;\, x\big) + b_2\,x^{-p}\,{}_1F_1\big(-n-p;\, 1-p;\, x\big) + x^{-p}.$$
As the weight function corresponding to the self-adjoint form of Equation (163) is $w(x;p) = x^pe^{-x}$, defined on $[0,\infty)$, and $z(x) = x^{-p}$, applying the condition (114) to the solution (164) gives
$$b_1\int_0^\infty e^{-x}\,{}_1F_1\big(-n; p+1; x\big)\,dx + b_2\int_0^\infty x^{-p}e^{-x}\,{}_1F_1\big(-n-p; 1-p; x\big)\,dx = 0.$$
But since, in (165),
$$\int_0^\infty x^{-p}e^{-x}\,{}_1F_1\big(-n-p; 1-p; x\big)\,dx = \Gamma(1-p)\sum_{k=0}^\infty \frac{(-n-p)_k}{k!} = 0,$$
and
$$\int_0^\infty e^{-x}\,{}_1F_1\big(-n; p+1; x\big)\,dx = \sum_{k=0}^n \frac{(-n)_k(1)_k}{(p+1)_k\,k!} = {}_2F_1\big(-n, 1; p+1; 1\big) = \frac{p}{p+n} \neq 0 \quad\text{for } p\neq 0,$$
choosing $b_1 = 0$ in (164) eventually yields
$$G_n(x;p) = x^{-p}\,{}_1F_1\big(-n-p;\, 1-p;\, x\big),$$
which is orthogonal with respect to the weight $x^pe^{-x}$ on $[0,\infty)$ if and only if $p\in(-1,1)$. Let us also prove this latter result via its differential equation. If Equation (163) is written in the self-adjoint form
$$\big(x^{p+1}e^{-x}\Phi_n'(x)\big)' + n\,x^pe^{-x}\,\Phi_n(x) = (n+p)\,e^{-x},$$
and, for the index $m$, as
$$\big(x^{p+1}e^{-x}\Phi_m'(x)\big)' + m\,x^pe^{-x}\,\Phi_m(x) = (m+p)\,e^{-x},$$
we get
$$\Big[x^{p+1}e^{-x}\big(\Phi_m\Phi_n' - \Phi_n\Phi_m'\big)\Big]_0^\infty + (n-m)\int_0^\infty x^pe^{-x}\Phi_n\Phi_m\,dx = (n+p)\int_0^\infty e^{-x}\Phi_m\,dx - (m+p)\int_0^\infty e^{-x}\Phi_n\,dx = (n-m)\,\Gamma(1-p),$$
which results in
$$(n-m)\Big(\int_0^\infty x^pe^{-x}\Phi_n\Phi_m\,dx - \Gamma(1-p)\Big) = -\Big[x^{p+1}e^{-x}\big(\Phi_m\Phi_n' - \Phi_n\Phi_m'\big)\Big]_0^\infty = 0 \qquad (n\neq m,\; -1<p<1).$$
Finally, since
$$\int_0^\infty x^pe^{-x}\,G_n(x;p)G_m(x;p)\,dx = \int_0^\infty x^{-p}e^{-x}\,{}_1F_1\big(-n-p; 1-p; x\big)\,{}_1F_1\big(-m-p; 1-p; x\big)\,dx,$$
the simplified sequence $G_n^*(x;p) = {}_1F_1\big(-n+p;\, 1+p;\, x\big)$ (obtained by replacing $p$ with $-p$) is orthogonal with respect to the weight function $x^pe^{-x}$ on $[0,\infty)$ if $-1 < p < 1$.
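The evaluation $\int_0^\infty e^{-x}\,{}_1F_1(-n; p+1; x)\,dx = p/(p+n)$, which forced $b_1 = 0$ above, is easy to confirm numerically (a sketch; the value of p is an arbitrary choice):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import hyp1f1

p = 0.6
for n in range(1, 6):
    # 1F1(-n; p+1; x) is a degree-n polynomial, so the integral converges
    val, _ = quad(lambda x: np.exp(-x) * hyp1f1(-n, p + 1, x), 0, np.inf)
    assert abs(val - p / (p + n)) < 1e-8
```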

7. p-Uncorrelated Polynomials Sequence (p-UPS)

As the most important part of p-uncorrelated functions is the class of p-uncorrelated polynomials, we prefer to study them in a separate section and consider their monic form (i.e., with leading coefficient equal to 1, without loss of generality) in a continuous space. With reference to Theorem 2 (or representation (43)) and using the initial monomial data $\{V_k = x^k\}_{k=0}^n$, a monic type of p-UPS with respect to $z(x)$, say $\bar P_n\big(x,p;z(x)\big)$, can be generated, so that
cov p P ¯ n x , p ; z ( x ) , P ¯ m x , p ; z ( x ) ; z ( x ) = 0 n m .
In a weighted continuous space, the condition (166) is equivalent to
$$\int_a^b w\,\bar P_n\big(x,p;z(x)\big)\bar P_m\big(x,p;z(x)\big)\,dx - p\,\frac{\int_a^b wz\,\bar P_n\,dx\,\int_a^b wz\,\bar P_m\,dx}{\int_a^b wz^2\,dx} = \left(\int_a^b w\,\bar P_n^2\,dx - p\,\frac{\big(\int_a^b wz\,\bar P_n\,dx\big)^2}{\int_a^b wz^2\,dx}\right)\delta_{n,m},$$
where $w(x) > 0$ and $z(x)$ are two continuous functions such that $\int_a^b w(x)z^2(x)\,dx > 0$.
Relation (167) is well defined whenever the integrals $\int_a^b w(x)\,x^nz(x)\,dx$ all exist and are bounded for every $n = 0,1,\ldots$; in this section, we deal with such spaces, in which the role of $z(x)$ is essential. Let us recall that a hypergeometric polynomial, as a particular case of the generalized hypergeometric series
F q p a 1 , , a p b 1 , , b q x = k = 0 ( a 1 ) k ( a p ) k ( b 1 ) k ( b q ) k x k k ! ,
where $b_1,\ldots,b_q \neq 0,-1,-2,\ldots$ and $(a)_k = a(a+1)\cdots(a+k-1)$, appears when one of the values $\{a_k\}_{k=1}^p$ is a negative integer. We also recall that the aforesaid series converges for all $x$ if $p < q+1$, converges for $x\in(-1,1)$ if $p = q+1$, and diverges for all $x\neq 0$ if $p > q+1$, unless one of $\{a_k\}_{k=1}^p$ is a negative integer, in which case it reduces to a hypergeometric polynomial that converges for all real $x$.
Theorem 11.
Let { P ¯ n ( x , p ; z ( x ) ) } be a monic p-UPS with respect to the fixed function z ( x ) , and let Q ¯ m ( x ) be an arbitrary monic polynomial of degree m n . Then,
cov p Q ¯ m ( x ) , P ¯ n ( x , p ; z ( x ) ) ; z ( x ) = var p P ¯ n ( x , p ; z ( x ) ) ; z ( x ) δ m , n .
Proof. 
Since P ¯ n x , p ; z ( x ) is a p-UPS, its elements are linearly independent according to Theorem 1. Hence, { P ¯ k ( x , p ; z ( x ) ) } k = 0 m is a basis for all polynomials of degree at most m and, for any arbitrary monic polynomial of degree m, say Q ¯ m ( x ) , there exist constants { b k } , such that
Q ¯ m ( x ) = k = 0 m b k P ¯ k ( x , p ; z ( x ) ) with b m = 1 .
By the linearity of p-covariances with respect to z ( x ) and using (166), we get
cov p Q ¯ m ( x ) , P ¯ n ( x , p ; z ( x ) ) ; z ( x ) = k = 0 m b k cov p P ¯ k ( x , p ; z ( x ) ) , P ¯ n ( x , p ; z ( x ) ) ; z ( x ) = 0 if m < n , var p P ¯ n ( x , p ; z ( x ) ) ; z ( x ) if m = n .
For the monomial basis V k = x k k = 0 n , the elements of the main determinant (40), i.e.,
Δ n ( p ) { x k } k = 0 n ; z ( x ) = var p ( 1 ; z ( x ) ) cov p ( x , 1 ; z ( x ) ) cov p ( x n , 1 ; z ( x ) ) cov p ( 1 , x ; z ( x ) ) var p ( x ; z ( x ) ) cov p ( x n , x ; z ( x ) ) cov p ( 1 , x n ; z ( x ) ) cov p ( x , x n ; z ( x ) ) var p ( x n ; z ( x ) ) ,
can be expressed in terms of a special case of moments that are related to z ( x ) directly. In other words, if we define
$$\mu_n\big(z(x)\big) = \Big(\int_a^b w(x)\,dx\Big)\,E\big(x^nz(x)\big) = \int_a^b w(x)\,x^nz(x)\,dx,$$
then the entries of (169) can be represented in terms of the definition (170) as follows:
$$\mathrm{cov}_p\big(x^i, x^j; z(x)\big) = \frac{\mu_{i+j}(1)}{\mu_0(1)} - p\,\frac{\mu_i\big(z(x)\big)\,\mu_j\big(z(x)\big)}{\mu_0(1)\,\mu_0\big(z^2(x)\big)}.$$
Theorem 12.
A necessary condition for the existence of a monic p-UPS with respect to the function z ( x ) is that Δ n ( p ) { x k } k = 0 n ; z ( x ) 0 .
Proof. 
Suppose P ¯ n ( x , p ; z ( x ) ) = k = 0 n c k ( n ) ( p ) x k is a monic p-UPS with c n ( n ) ( p ) = 1 . Referring to (168), for m n and Q ¯ m ( x ) = x m , we observe that the relations
cov p x m , P ¯ n ( x , p ; z ( x ) ) ; z ( x ) = k = 0 n c k ( n ) ( p ) cov p x m , x k ; z ( x ) = var p P ¯ n ( x , p ; z ( x ) ) ; z ( x ) δ m , n ,
will at last lead to the following solvable linear system:
var p ( 1 ; z ( x ) ) cov p ( x , 1 ; z ( x ) ) cov p ( x n , 1 ; z ( x ) ) cov p ( 1 , x ; z ( x ) ) var p ( x ; z ( x ) ) cov p ( x n , x ; z ( x ) ) cov p ( 1 , x n ; z ( x ) ) cov p ( x , x n ; z ( x ) ) var p ( x n ; z ( x ) ) c 0 ( n ) ( p ) c 1 ( n ) ( p ) c n ( n ) ( p ) = 0 0 var p P ¯ n ( x , p ; z ( x ) ) ; z ( x ) ,
which obviously implies that Δ n ( p ) { x k } k = 0 n ; z ( x ) 0 . □
One of the consequences of solving the above system is that
$$\mathrm{var}_p\big(\bar P_n(x,p;z(x));\, z(x)\big) = \frac{\Delta_n^{(p)}\big(\{x^k\}_{k=0}^n;\, z(x)\big)}{\Delta_{n-1}^{(p)}\big(\{x^k\}_{k=0}^{n-1};\, z(x)\big)} \qquad (n\geq 1),$$
which remains valid for $n = 0$ if we set $\Delta_{-1}^{(p)}(\cdot) = 1$.
Another consequence is a general representation for the non-monic case of uncorrelated polynomials as
P n x , p ; z ( x ) = var p ( 1 ; z ( x ) ) cov p ( 1 , x ; z ( x ) ) cov p ( 1 , x n ; z ( x ) ) cov p ( x , 1 ; z ( x ) ) var p ( x ; z ( x ) ) cov p ( x , x n ; z ( x ) ) cov p ( x n 1 , 1 ; z ( x ) ) cov p ( x n 1 , x ; z ( x ) ) cov p ( x n 1 , x n ; z ( x ) ) 1 x x n
Remark 5.
For $p = 1$ and $z(x) = x^\lambda$, since
$$\mathrm{cov}_1\big(x^i, x^j; x^\lambda\big) = 0 \iff \lambda = i \ \text{ or } \ \lambda = j,$$
the rows of the determinant (172) show that $z(x)$ should be chosen as a member of the monomial basis $\{x^k\}_{k=0}^\infty$ only conditionally. For example, if $\lambda = 5$, the finite polynomial set $\{P_n(x,1;x^5)\}_{n=0}^5$ is finitely 1-uncorrelated with respect to $z(x) = x^5$. The determinant (172) also shows that, for the monic type of the polynomials, we always have $\bar P_n(x,1;x^n) = x^n$; e.g., $\bar P_5(x,1;x^5) = x^5$.
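The determinant representation (172) is directly implementable. The following SymPy sketch builds $P_n(x,p;x^r)$ for a sample rational r (an arbitrary choice) with p kept symbolic, and verifies the p-uncorrelatedness (167); the helper names are introduced here only for illustration:

```python
import sympy as sp

x, p = sp.symbols('x p')
r = sp.Rational(3, 2)                       # z(x) = x^r, w(x) = 1 on [0, 1]

def cov_ij(i, j):                           # cov_p(x^i, x^j; x^r)
    return sp.Rational(1, i + j + 1) - p * (2*r + 1) / ((r + i + 1) * (r + j + 1))

def P(n):                                   # non-monic p-UPS via (172)
    M = sp.Matrix(n + 1, n + 1,
                  lambda i, j: cov_ij(i, j) if i < n else x**j)
    return sp.expand(M.det())

def cov_p(f, g):                            # cov_p(f, g; x^r) for polynomials
    E = lambda h: sp.integrate(sp.expand(h), (x, 0, 1))
    return E(f * g) - p * E(f * x**r) * E(g * x**r) / E(x**(2 * r))

for n in range(3):
    for m in range(n):
        assert sp.simplify(cov_p(P(n), P(m))) == 0   # holds for every p
```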
Theorem 13.
Let { P n ( x , p ; z ( x ) ) } be a (weighted) p-UPS with respect to the function z ( x ) . For any arbitrary polynomial Q n ( x ) = k = 0 n q k x k of degree n, we have
$$\mathrm{cov}_p\big(Q_n(x), P_n(x,p;z(x)); z(x)\big) = q_n\,\mathrm{cov}_p\big(x^n, P_n(x,p;z(x)); z(x)\big) = q_n\,c_n^{(n)}(p)\,\frac{\Delta_n^{(p)}\big(\{x^k\}_{k=0}^n; z(x)\big)}{\Delta_{n-1}^{(p)}\big(\{x^k\}_{k=0}^{n-1}; z(x)\big)},$$
where c n ( n ) ( p ) denotes the leading coefficient of P n ( x , p ; z ( x ) ) .
Proof. 
Suppose Q n ( x ) = q n x n + Q n 1 * ( x ) , in which Q n 1 * ( x ) is a polynomial of degree at most n 1 . In this case,
cov p Q n ( x ) , P n x , p ; z ( x ) ; z ( x ) = q n cov p x n , P n x , p ; z ( x ) ; z ( x ) + cov p Q n 1 * ( x ) , P n x , p ; z ( x ) ; z ( x ) = q n cov p x n , P n x , p ; z ( x ) ; z ( x ) ,
and the proof is completed by noting (171) and Theorem 12. □

7.1. A Generic Recurrence Relation for p-UPS

Let Q ¯ m ( x ) be an arbitrary monic polynomial of degree m, and let { P ¯ n ( x , p ; z ( x ) ) } be a monic p-UPS with respect to the function z ( x ) . According to Theorem 11 and relation (168), we have
Q ¯ m ( x ) = k = 0 m cov p Q ¯ m ( x ) , P ¯ k ( x , p ; z ( x ) ) ; z ( x ) var p P ¯ k ( x , p ; z ( x ) ) ; z ( x ) P ¯ k ( x , p ; z ( x ) ) .
Now, replacing m = n + 1 and Q ¯ n + 1 ( x ) = x P ¯ n x , p ; z ( x ) in (173) as
x P ¯ n ( x , p ; z ( x ) ) = k = 0 n + 1 cov p x P ¯ n ( x , p ; z ( x ) ) , P ¯ k ( x , p ; z ( x ) ) ; z ( x ) var p P ¯ k ( x , p ; z ( x ) ) ; z ( x ) P ¯ k ( x , p ; z ( x ) ) ,
gives a n + 2 term recurrence relation for the sequence { P ¯ n ( x , p ; z ( x ) ) } .
In general, since
$$\mathrm{cov}_p\big(f(x)h(x), g(x); z(x)\big) = \mathrm{cov}_p\big(f(x), h(x)g(x); z(x)\big) + p\,\frac{E\big(f(x)z(x)\big)E\big(h(x)g(x)z(x)\big) - E\big(g(x)z(x)\big)E\big(f(x)h(x)z(x)\big)}{E\big(z^2(x)\big)},$$
the coefficients of (174) can be rewritten in the form
$$\mathrm{cov}_p\big(x\bar P_n, \bar P_k; z(x)\big) = \mathrm{cov}_p\big(\bar P_n, x\bar P_k; z(x)\big) + p\,\frac{E\big(x\bar P_kz\big)E\big(\bar P_nz\big) - E\big(x\bar P_nz\big)E\big(\bar P_kz\big)}{E\big(z^2\big)},$$
where $\bar P_j$ abbreviates $\bar P_j(x,p;z(x))$. Note in the above formula that
$$\mathrm{cov}_p\big(\bar P_n, x\bar P_k; z(x)\big) = 0 \qquad (k+1 < n).$$
Accordingly, if $p = 0$ or $E\big(z(x)\bar P_k(x,p;z(x))\big) = 0$ in (174), it reduces to the celebrated three-term recurrence relation of monic orthogonal polynomials [25]. See also relations (201) and (202) in this regard.

7.2. A Complete Uncorrelated Sequence of Hypergeometric Polynomials of F 3 4 Type

In this section, we introduce a complete uncorrelated hypergeometric polynomial sequence that reveals the importance of the role of a non-constant fixed function. Let $w(x) = 1$ be defined on $[0,1]$ and $z(x) = x^r$ for $r > -1/2$, so that $\int_0^1 w(x)z^2(x)\,dx = \frac{1}{2r+1}$. Then the components of the determinant (172) are explicitly computed as
$$\mathrm{cov}_p\big(x^i, x^j; x^r\big) = \frac{(r+i+1)(r+j+1) - p(2r+1)(i+j+1)}{(r+i+1)(r+j+1)(i+j+1)} \qquad (i = 0,1,\ldots,n-1;\; j = 0,1,\ldots,n).$$
A few samples of the monic type of the corresponding p-uncorrelated polynomials are
$$\bar P_0(x,p;x^r) = 1,\qquad \bar P_1(x,p;x^r) = x - \frac{r+1}{2(r+2)}\cdot\frac{(r+1)(r+2) - 2p(2r+1)}{(r+1)^2 - p(2r+1)},$$
$$\bar P_2(x,p;x^r) = x^2 - \frac{r+2}{r+3}\cdot\frac{(r+1)^2(r+2)(r+3) - p(2r+1)(5r^2+5r+6)}{(r+1)^2(r+2)^2 - 4p(2r+1)(r^2+r+1)}\,x + \frac{r+1}{6(r+3)}\cdot\frac{(r+1)(r+2)^2(r+3) - 6p(2r+1)(r^2+r+2)}{(r+1)^2(r+2)^2 - 4p(2r+1)(r^2+r+1)},$$
satisfying the uncorrelatedness condition
$$\int_0^1 \bar P_n(x,p;x^r)\bar P_m(x,p;x^r)\,dx = p(2r+1)\int_0^1 x^r\bar P_n(x,p;x^r)\,dx\int_0^1 x^r\bar P_m(x,p;x^r)\,dx \qquad (n\neq m).$$
For $p = 0$, the above samples are in fact the monic type of the well-known shifted Legendre polynomials $P_n(x,0;x^r) = P_{n,+}(x) = {}_2F_1(-n, n+1; 1; x)$, having the orthogonality property [25] $\int_0^1 P_{n,+}P_{m,+}\,dx = \frac{1}{2n+1}\delta_{n,m}$, while, for the optimized case $p = 1$, relation (175) is first simplified as
$$\mathrm{cov}_1\big(x^i, x^j; x^r\big) = \frac{(r-i)(r-j)}{(r+i+1)(r+j+1)(i+j+1)} \qquad (i = 0,1,\ldots,n-1;\; j = 0,1,\ldots,n).$$
If (176) is directly substituted into (172), then, with the aid of advanced mathematical software (see, e.g., [26,27]), we can eventually find the explicit form of the polynomials as
$$P_n(x,1;x^r) = {}_4F_3\big(-n,\, n+1,\, -r,\, r+2;\; 1,\, r+1,\, 1-r;\; x\big),$$
satisfying the complete uncorrelatedness condition
$$\int_0^1 P_nP_m\,dx - (2r+1)\int_0^1 x^rP_n\,dx\int_0^1 x^rP_m\,dx = \Big(\int_0^1 P_n^2\,dx - (2r+1)\Big(\int_0^1 x^rP_n\,dx\Big)^2\Big)\,\delta_{m,n},$$
where $P_k = P_k(x,1;x^r)$ and
$$\int_0^1 x^rP_n(x,1;x^r)\,dx = \frac{1}{r+1}\,{}_3F_2\big(-n,\, n+1,\, -r;\; 1,\, 1-r;\; 1\big) = \frac{1}{r+1}\,\frac{(1+r)_n}{(1-r)_n},$$
and
0 1 P n ( x , 1 ; x r ) P m ( x , 1 ; x r ) d x = k = 0 n ( n ) k ( n + 1 ) k ( r ) k ( r + 2 ) k ( 2 ) k ( r + 1 ) k ( r + 1 ) k k ! F 4 5 m , m + 1 , r , r + 2 , k + 1 1 , r + 1 , r + 1 , k + 2 1 ,
which simplifies the left side of (178) as
0 1 P n ( x , 1 ; x r ) P m ( x , 1 ; x r ) d x ( 2 r + 1 ) 0 1 x r P n ( x , 1 ; x r ) d x 0 1 x r P m ( x , 1 ; x r ) d x = k = 0 n ( n ) k ( n + 1 ) k ( r ) k ( r + 2 ) k ( 2 ) k ( r + 1 ) k ( r + 1 ) k k ! F 4 5 m , m + 1 , r , r + 2 , k + 1 1 , r + 1 , r + 1 , k + 2 1 2 r + 1 ( r + 1 ) 2 ( 1 + r ) n ( 1 + r ) m ( 1 r ) n ( 1 r ) m .
Of course, the result of (179) has been derived using the following technique. Since
H ( n ) = F 2 3 n , n + 1 , r 1 , r + 1 1 ,
satisfies the first order recurrence equation
H ( n + 1 ) = n + 1 + r n + 1 r H ( n ) ,
so
F 2 3 n , n + 1 , r 1 , r + 1 1 = ( 1 + r ) n ( 1 r ) n .
This technique can similarly be used to obtain F 4 5 ( . ) in (180). First, we can find out that the sequence
S ( m ) = F 4 5 m , m + 1 , r , r + 2 , k + 1 1 , r + 1 , r + 1 , k + 2 1 ,
satisfies the second-order recurrence equation
( m + 1 ) ( m + 2 r ) ( m + 3 + k ) S ( m + 2 ) ( 2 m + 3 ) ( m + 1 ) ( m + 2 ) + r ( k + 1 ) S ( m + 1 ) + ( m + 2 ) ( m + 1 + r ) ( m k ) S ( m ) = 0 ,
having two independent solutions
S 1 ( m ) = ( m + r ) ( r ) m ( m r ) ( r ) m and S 2 ( m ) = ( k ) m ( k + m + 1 ) ( k + m ) ( k ) m .
It is clear that
S ( m ) = A S 1 ( m ) + B S 2 ( m ) .
To compute the unknown coefficients A and B, it is sufficient to replace m = 0 , 1 , respectively, in (181) to finally get
F 4 5 m , m + 1 , r , r + 2 , k + 1 1 , r + 1 , r + 1 , k + 2 1 = k + 1 ( k + r + 1 ) ( r + 1 ) ( 2 r + 1 ) ( m + r ) ( r ) m ( m r ) ( r ) m + r k ( k r ) ( k ) m ( k + m + 1 ) ( k + m ) ( k ) m .
Thus, in order to compute
var 1 P n ( x , 1 ; x r ) ; x r = 0 1 P n 2 ( x , 1 ; x r ) d x ( 2 r + 1 ) 0 1 x r P n ( x , 1 ; x r ) d x 2 ,
we first suppose in (180) that m = n and then refer to (182) to arrive at
S * = k = 0 n ( n ) k ( n + 1 ) k ( r ) k ( r + 2 ) k ( 2 ) k ( r + 1 ) k ( r + 1 ) k k ! F 4 5 n , n + 1 , r , r + 2 , k + 1 1 , r + 1 , r + 1 , k + 2 1 = ( 2 r + 1 ) ( n + r ) ( r ) n ( r + 1 ) ( n r ) ( r ) n k = 0 n ( n ) k ( n + 1 ) k ( r ) k ( r + 2 ) k ( 2 ) k ( r + 1 ) k ( r + 1 ) k k ! k + 1 k + r + 1 r r + 1 k = 0 n ( n ) k ( n + 1 ) k ( r ) k ( r + 2 ) k ( 2 ) k ( r + 1 ) k ( r + 1 ) k k ! k + 1 k + r + 1 k ( k r ) ( k ) n ( k + n + 1 ) ( k + n ) ( k ) n .
Since, in general,
( a ) k ( a + 1 ) k = a a + k ,
relation (183) is simplified as
S * = ( 2 r + 1 ) ( n + r ) ( r ) n ( r + 1 ) 2 ( n r ) ( r ) n ( 1 + r ) n ( 1 r ) n + r 2 ( r + 1 ) 2 ( n + 1 ) ! k = 0 n ( k ) n ( n + 1 ) k ( n ) k ( 1 ) k ( n + 2 ) k = ( 2 r + 1 ) ( n + r ) ( r ) n ( r + 1 ) 2 ( n r ) ( r ) n ( 1 + r ) n ( 1 r ) n + r 2 ( r + 1 ) 2 ( n + 1 ) ! n ! ( n + 1 ) n ( n + 2 ) n .
Noting that ( k ) n = 0 for any k < n in (184), the following final result will be derived.
Corollary 5.
If $r > -1/2$ and $r\notin\mathbb{Z}^+$, then
$$\int_0^1 P_n^2(x,1;x^r)\,dx - (2r+1)\Big(\int_0^1 x^rP_n(x,1;x^r)\,dx\Big)^2 = \frac{r^2}{(r+1)^2}\,\frac{1}{2n+1}.$$
Moreover, (180) is simplified as
$$\int_0^1 P_nP_m\,dx - (2r+1)\int_0^1 x^rP_n\,dx\int_0^1 x^rP_m\,dx = \frac{r^2}{(r+1)^2\,(m+1)!}\sum_{k=0}^n \frac{(n+1)_k(-n)_k}{(m+2)_k}\,\frac{(-k)_m}{k!} = 0 \qquad (m\neq n),$$
which means that
$$\sum_{k=m}^n \frac{(n+1)_k\,(-n)_k\,(-k)_m}{(m+2)_k\,k!} = \frac{(n+1)!}{2n+1}\,\delta_{n,m}.$$
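The last identity can be confirmed by exact rational arithmetic; a small Python sketch:

```python
from fractions import Fraction
from math import factorial

def poch(a, k):                 # Pochhammer symbol (a)_k = a (a+1) ... (a+k-1)
    out = 1
    for i in range(k):
        out *= a + i
    return out

def lhs(n, m):
    return sum(Fraction(poch(n + 1, k) * poch(-n, k) * poch(-k, m),
                        poch(m + 2, k) * factorial(k))
               for k in range(m, n + 1))

for n in range(7):
    for m in range(n + 1):
        expect = Fraction(factorial(n + 1), 2 * n + 1) if m == n else 0
        assert lhs(n, m) == expect
```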
Remark 6.
It is interesting to know that
lim r P n ( x , 1 ; x r ) = lim r F 3 4 n , n + 1 , r , r + 2 1 , r + 1 , r + 1 x = F 1 2 n , n + 1 1 x = P n , + ( x ) ,
and from Corollary 5 we subsequently conclude that
$$\lim_{r\to\infty}\frac{r^2}{(r+1)^2}\,\frac{1}{2n+1} = \lim_{r\to\infty}\left(\int_0^1 P_n^2(x,1;x^r)\,dx - (2r+1)\Big(\frac{1}{r+1}\,\frac{(1+r)_n}{(1-r)_n}\Big)^2\right) = \int_0^1 P_{n,+}^2(x)\,dx = \frac{1}{2n+1}.$$
Remark 7.
Instead of z ( x ) = x r , if we take z ( x ) = ( 1 x ) r defined on [ 0 , 1 ] , (178) changes to
0 1 P n 1 x , 1 ; ( 1 x ) r P m 1 x , 1 ; ( 1 x ) r d x ( 2 r + 1 ) 0 1 ( 1 x ) r P n 1 x , 1 ; ( 1 x ) r d x 0 1 ( 1 x ) r P m 1 x , 1 ; ( 1 x ) r d x = r 2 ( r + 1 ) 2 1 2 n + 1 δ m , n .
Using Corollary 5, we can now establish an optimized approximation (or expansion, as $n\to\infty$) of polynomial type for any expandable function, say $f(x)$, as follows:
$$f(x) \approx \frac{(r+1)^2}{r^2}\sum_{k=0}^n (2k+1)\,\mathrm{cov}_1\big(f(x), P_k(x,1;x^r); x^r\big)\,P_k(x,1;x^r),$$
whose error 1-variance is minimized with respect to the fixed function z ( x ) = x r on [ 0 , 1 ] .
Furthermore, if $f(x)$ is of polynomial type, the approximation in the above relation becomes an equality, according to the finite-type Theorem 4 and its important consequences in Remark 1. For example, since $P_0(x,1;x^r) = 1$, $P_1(x,1;x^r) = -\frac{2r(r+2)}{r^2-1}x + 1$ and $P_2(x,1;x^r) = \frac{6r(r+3)}{(r-2)(r+1)}x^2 - \frac{6r(r+2)}{r^2-1}x + 1$, every arbitrary polynomial of degree 2, say $ax^2+bx+c$, can be expanded in terms of the above samples as follows:
$$ax^2+bx+c = \frac{(r+1)^2}{r^2}\,\mathrm{cov}_1\big(ax^2+bx+c, P_0; x^r\big) + \frac{3(r+1)^2}{r^2}\,\mathrm{cov}_1\big(ax^2+bx+c, P_1; x^r\big)\,P_1(x,1;x^r) + \frac{5(r+1)^2}{r^2}\,\mathrm{cov}_1\big(ax^2+bx+c, P_2; x^r\big)\,P_2(x,1;x^r),$$
where
$$\mathrm{cov}_1\big(ax^2+bx+c, P_0; x^r\big) = \frac{r(r-2)}{3(r+1)(r+3)}\,a + \frac{r(r-1)}{2(r+1)(r+2)}\,b + \frac{r^2}{(r+1)^2}\,c,\qquad \mathrm{cov}_1\big(ax^2+bx+c, P_1; x^r\big) = -\frac{r(r-2)}{6(r+1)(r+3)}\,a - \frac{r(r-1)}{6(r+1)(r+2)}\,b,$$
and
$$\mathrm{cov}_1\big(ax^2+bx+c, P_2; x^r\big) = \frac{r(r-2)}{30(r+1)(r+3)}\,a.$$
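Because the expansion terminates exactly for polynomial f, it can be verified symbolically. The sketch below checks it for one sample rational value of r (an arbitrary choice), with a, b, c kept symbolic:

```python
import sympy as sp

x, a, b, c = sp.symbols('x a b c')
r = sp.Rational(5, 2)

def cov1(f, g):                     # cov_1(f, g; x^r) on [0, 1], w = 1
    E = lambda h: sp.integrate(sp.expand(h), (x, 0, 1))
    return E(f * g) - E(f * x**r) * E(g * x**r) / E(x**(2 * r))

P = [sp.Integer(1),
     1 - 2*r*(r + 2)/(r**2 - 1) * x,
     1 - 6*r*(r + 2)/(r**2 - 1) * x + 6*r*(r + 3)/((r - 2)*(r + 1)) * x**2]

f = a * x**2 + b * x + c
expansion = sum((2*k + 1) * (r + 1)**2 / r**2 * cov1(f, P[k]) * P[k]
                for k in range(3))
assert sp.simplify(sp.expand(expansion - f)) == 0
```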
According to Remark 2, since the uncorrelated polynomials (177) are now explicitly known, a new sequence of orthogonal functions can be constructed as follows:
$$G_n(x,1;r) = {}_4F_3\big(-n,\, n+1,\, -r,\, r+2;\; 1,\, r+1,\, 1-r;\; x\big) - \frac{2r+1}{r+1}\,\frac{(1+r)_n}{(1-r)_n}\,x^r,$$
satisfying the orthogonality relation
$$\int_0^1 G_n(x,1;r)\,G_m(x,1;r)\,dx = \frac{r^2}{(r+1)^2}\,\frac{1}{2n+1}\,\delta_{n,m}.$$
The above orthogonality is valid for every $n$ if and only if $r > -1/2$ and $r\notin\mathbb{Z}^+$. However, if $r = m\in\mathbb{N}$, the polynomial set $\{G_n(x,1;m)\}_{n=0}^{m-1}$ will be finitely orthogonal on $[0,1]$, according to Remark 5. For instance, for $m = 3$ the elements of the finite set
$$\{G_n(x,1;3)\}_{n=0}^2 = \Big\{\,1 - \tfrac{7}{4}x^3,\;\; 1 - \tfrac{15}{4}x + \tfrac{7}{2}x^3,\;\; 1 - \tfrac{45}{4}x + 27x^2 - \tfrac{35}{2}x^3\,\Big\},$$
are mutually orthogonal with respect to the constant weight function $w(x) = 1$ on $[0,1]$, with norm square values $\frac{9}{16}\big\{\frac{1}{2n+1}\big\}_{n=0}^2$.
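A direct symbolic check of this finite orthogonal set:

```python
import sympy as sp

x = sp.symbols('x')
G = [1 - sp.Rational(7, 4) * x**3,
     1 - sp.Rational(15, 4) * x + sp.Rational(7, 2) * x**3,
     1 - sp.Rational(45, 4) * x + 27 * x**2 - sp.Rational(35, 2) * x**3]

for n in range(3):
    for m in range(3):
        val = sp.integrate(G[n] * G[m], (x, 0, 1))
        assert val == (sp.Rational(9, 16) / (2 * n + 1) if n == m else 0)
```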

8. On the Ordinary Case of p-Covariances

When Z = c * 0 and p = 1 , we have cov 1 ( X , Y ; c * ) = cov ( X , Y ) . Replacing these assumptions in (43) as
X n = var ( V 0 ) cov ( V 0 , V 1 ) cov ( V 0 , V n ) cov ( V 1 , V 0 ) var ( V 1 ) cov ( V 1 , V n ) cov ( V n 1 , V 0 ) cov ( V n 1 , V 1 ) cov ( V n 1 , V n ) V 0 V 1 V n ,
generates an ordinary uncorrelated element. Note that the leading coefficient in (43) has no effect on being uncorrelated, and we have therefore ignored it in the above determinant.
There are, in total, two cases for the initial value V 0 in (185), i.e., when V 0 is a constant value (preferably V 0 = 1 ) or it is not constant. As we have shown in Section 2.3.2, the first case leads to the approved result (26), whereas the non-constant case contains some novel results. The next subsection describes one of them.

Another Uncorrelated Sequence of Hypergeometric Polynomials of F 3 4 Type

Let $w(x) = 1$ be defined on $[0,1]$, and consider the initial data $\{V_k = x^{r+k}\}_{k=0}^n$ for $r\neq 0$, so that $V_0\neq 1$. In this case, the components corresponding to (185) are computed as
$$\mathrm{cov}\big(x^{r+i}, x^{r+j}\big) = \frac{(r+i)(r+j)}{(r+i+1)(r+j+1)(2r+i+j+1)} \qquad (i = 0,1,\ldots,n-1;\; j = 0,1,\ldots,n).$$
Since $\mathrm{var}(x^{r+j}) = \frac{(r+j)^2}{(r+j+1)^2(2r+2j+1)} > 0$, for $j = 0$ we must also add the condition $2r+1 > 0$. If (186) is directly substituted into (185), then, with the aid of advanced mathematical software, we can again find the explicit form of the functions, say $X_n = f_{r+n}(x;r)$, as
$$f_{r+n}(x;r) = x^r\,Q_n(x;r) = x^r\,{}_4F_3\big(-n,\, n+2r+1,\, r,\, r+2;\; 2r+1,\, r+1,\, r+1;\; x\big),$$
satisfying the ordinary uncorrelatedness condition
$$\int_0^1 f_{r+n}f_{r+m}\,dx - \int_0^1 f_{r+n}\,dx\int_0^1 f_{r+m}\,dx = \Big(\int_0^1 f_{r+n}^2\,dx - \Big(\int_0^1 f_{r+n}\,dx\Big)^2\Big)\,\delta_{m,n},$$
which is equivalent to
$$\int_0^1 x^{2r}Q_nQ_m\,dx - \int_0^1 x^rQ_n\,dx\int_0^1 x^rQ_m\,dx = \Big(\int_0^1 x^{2r}Q_n^2\,dx - \Big(\int_0^1 x^rQ_n\,dx\Big)^2\Big)\,\delta_{m,n},$$
where $f_{r+k} = f_{r+k}(x;r)$ and $Q_k = Q_k(x;r)$.
Remark 8.
If, in general, P r ( X = x ) = u ( x ) / a b u ( x ) d x , one can find a relationship between ordinary covariances and 1-covariances as follows:
a b w ( x ) d x cov z ( x ) f ( x ) , z ( x ) g ( x ) u ( x ) = w ( x ) = a b w ( x ) z 2 ( x ) d x cov 1 f ( x ) , g ( x ) ; 1 z ( x ) u ( x ) = w ( x ) z 2 ( x ) .
Hence, (188) shows that, if z ( x ) = x r and w ( x ) = 1 are replaced in (189), then
cov x r Q n ( x ; r ) , x r Q m ( x ; r ) u ( x ) = 1 = 1 2 r + 1 cov 1 Q n ( x ; r ) , Q m ( x ; r ) ; x r u ( x ) = x 2 r .
In the sequel, in (188),
0 1 x r Q n ( x ; r ) d x = k = 0 n ( n ) k ( n + 2 r + 1 ) k ( r ) k ( r + 2 ) k ( 2 r + 1 ) k ( r + 1 ) k ( r + 1 ) k k ! ( r + 1 + k ) = 1 r + 1 F 2 3 n , n + 2 r + 1 , r 2 r + 1 , r + 1 1 = 1 r + 1 n ! ( 2 r + 1 ) n ,
and
0 1 x 2 r Q n ( x ; r ) Q m ( x ; r ) d x = 1 2 r + 1 k = 0 n ( n ) k ( n + 2 r + 1 ) k ( r ) k ( r + 2 ) k ( 2 r + 2 ) k ( r + 1 ) k ( r + 1 ) k k ! F 4 5 m , m + 2 r + 1 , r , r + 2 , 2 r + 1 + k 2 r + 1 , r + 1 , r + 1 , 2 r + 2 + k 1 ,
yielding
0 1 x 2 r Q n ( x ; r ) Q m ( x ; r ) d x 0 1 x r Q n ( x ; r ) d x 0 1 x r Q m ( x ; r ) d x = 1 2 r + 1 k = 0 n ( n ) k ( n + 2 r + 1 ) k ( r ) k ( r + 2 ) k ( 2 r + 2 ) k ( r + 1 ) k ( r + 1 ) k k ! F 4 5 m , m + 2 r + 1 , r , r + 2 , 2 r + 1 + k 2 r + 1 , r + 1 , r + 1 , 2 r + 2 + k 1 1 ( r + 1 ) 2 n ! m ! ( 2 r + 1 ) n ( 2 r + 1 ) m .
To derive (190), we have used a similar computational technique as follows. Since
A ( n ) = F 2 3 n , n + 2 r + 1 , r 2 r + 1 , r + 1 1 ,
satisfies the first-order equation
A ( n + 1 ) = n + 1 n + 1 + 2 r A ( n ) ,
so
F 2 3 n , n + 2 r + 1 , r 2 r + 1 , r + 1 1 = n ! ( 2 r + 1 ) n .
Also, to compute the hypergeometric term in (191), since
B ( m ) = F 4 5 m , m + 2 r + 1 , r , r + 2 , 2 r + 1 + k 2 r + 1 , r + 1 , r + 1 , 2 r + 2 + k 1 ,
satisfies the recurrence relation
( m + 2 r + 1 ) ( m + 2 r + 2 ) ( m + 2 r + 3 + k ) B ( m + 2 ) + ( m + 2 ) ( 2 m + 2 r + 3 ) ( m + 2 r + 1 ) B ( m + 1 ) + ( m + 1 ) ( m + 2 ) ( k m ) B ( m ) = 0 ,
having two independent solutions
B 1 ( m ) = m ! ( 2 r + 1 ) m and B 2 ( m ) = m ! ( k ) m ( 2 r + 1 ) m ( 2 r + k + 2 ) m ,
so
B ( m ) = c 1 B 1 ( m ) + c 2 B 2 ( m ) ,
which finally results in
F 4 5 m , m + 2 r + 1 , r , r + 2 , 2 r + 1 + k 2 r + 1 , r + 1 , r + 1 , 2 r + 2 + k 1 = 1 ( r + 1 ) ( r + 1 + k ) m ! ( 2 r + 1 ) m 2 r + 1 + k + r ( r + k ) ( k ) m ( 2 r + 2 + k ) m .
In order to compute
var f r + n ( x , r ) = 0 1 x 2 r Q n 2 ( x ; r ) d x 0 1 x r Q n ( x ; r ) d x 2 ,
it is enough to take m = n in (191) and then use (192) to arrive at
B * = k = 0 n ( n ) k ( n + 2 r + 1 ) k ( r ) k ( r + 2 ) k ( 2 r + 2 ) k ( r + 1 ) k ( r + 1 ) k k ! F 4 5 n , n + 2 r + 1 , r , r + 2 , 2 r + 1 + k 2 r + 1 , r + 1 , r + 1 , 2 r + 2 + k 1 = n ! ( 2 r + 1 ) n 2 r + 1 ( r + 1 ) 2 k = 0 n ( n ) k ( n + 2 r + 1 ) k ( r ) k ( 2 ) k ( r + 1 ) k ( r + 1 ) k k ! + n ! ( 2 r + 1 ) n r 2 ( r + 1 ) 2 ( 2 r + 2 ) n k = 0 n ( n ) k ( n + 2 r + 1 ) k ( k ) n ( n + 2 r + 2 ) k k ! ,
which is finally simplified as
B * = 2 r + 1 ( r + 1 ) 2 ( n ! ) 2 ( 2 r + 1 ) n ( 2 r + 1 ) n + r 2 ( r + 1 ) 2 ( n ! ) 2 ( n + 2 r + 1 ) n ( 2 r + 1 ) n ( 2 r + 2 ) n ( n + 2 r + 2 ) n .
Corollary 6.
If $r > -1/2$ and $r\neq 0$, then
$$\int_0^1 x^{2r}Q_n^2(x;r)\,dx - \Big(\int_0^1 x^rQ_n(x;r)\,dx\Big)^2 = \frac{1}{2n+2r+1}\Big(\frac{r\,n!}{(r+1)\,(2r+1)_n}\Big)^2.$$
Moreover, (191) is simplified as
$$\int_0^1 x^{2r}Q_nQ_m\,dx - \int_0^1 x^rQ_n\,dx\int_0^1 x^rQ_m\,dx = \frac{r^2\,m!}{(r+1)^2(2r+1)\,(2r+1)_m(2r+2)_m}\sum_{k=0}^n \frac{(n+2r+1)_k(-n)_k}{(m+2r+2)_k}\,\frac{(-k)_m}{k!} = 0 \qquad (m\neq n).$$
Therefore,
$$\sum_{k=m}^n \frac{(n+2r+1)_k\,(-n)_k\,(-k)_m}{(m+2r+2)_k\,k!} = \frac{n!\,(n+2r+1)}{2n+2r+1}\,\delta_{n,m},$$
which is a generalization of the second result of Corollary 5 for r = 0 .
In Section 10, we will prove that both polynomials obtained in Section 7 and Section 8 are particular cases of a two-parametric sequence of uncorrelated polynomials.

9. A Class of Uncorrelated Polynomials Based on a Predetermined Orthogonal Polynomial

In this section, we introduce a class of uncorrelated polynomials constructed from a predetermined sequence of orthogonal polynomials, and then we study its general properties in detail. In order to clarify the subject, we also present two hypergeometric examples based on Jacobi and Laguerre polynomials. First of all, we should raise two important points. Let {Φ_n(x)}_{n=0}^∞ (with Φ_0(x) = 1, which holds in particular for polynomial sequences) be a sequence of real functions orthogonal with respect to w(x) on [a, b], i.e.,
E Φ m ( x ) Φ n ( x ) w ( x ) = E Φ n 2 ( x ) w ( x ) δ m , n ,
where ( . ) w ( x ) means P r ( X = x ) = w ( x ) a b w ( x ) d x , as before.
The first point is that, if the mentioned sequence has the special form
Φ n ( x ) = θ ( x ) g n ( x ) + β n ( with Φ 0 ( x ) = 1 ) ,
in which θ ( x ) is a function independent of n, and { β n } is an arbitrary numeric sequence, then noting (194) and (189) in Remark 8, we have
cov Φ m ( x ) , Φ n ( x ) w ( x ) = E Φ m ( x ) Φ n ( x ) w ( x ) E Φ m ( x ) Φ 0 ( x ) w ( x ) E Φ n ( x ) Φ 0 ( x ) w ( x ) = E Φ n 2 ( x ) w ( x ) δ m , n = cov θ ( x ) g m ( x ) + β m , θ ( x ) g n ( x ) + β n w ( x ) = cov θ ( x ) g m ( x ) , θ ( x ) g n ( x ) w ( x ) = a b w ( x ) θ 2 ( x ) d x a b w ( x ) d x cov 1 g m ( x ) , g n ( x ) ; 1 θ ( x ) w ( x ) θ 2 ( x ) ,
leading to the following biorthogonality relation, in accordance with Section 4.1:
E g m ( x ) g n ( x ) E g n ( x ) / θ ( x ) E 1 / θ 2 ( x ) 1 θ ( x ) w ( x ) θ 2 ( x ) = a b w ( x ) d x a b w ( x ) θ 2 ( x ) d x E Φ n 2 ( x ) w ( x ) δ m , n .
Also, if it satisfies a second-order differential equation of the form
a(x) Φ_n″(x) + b(x) Φ_n′(x) + λ_n u(x) Φ_n(x) = 0,
then { g n ( x ) } n = 0 will satisfy the equation
a(x) θ(x) g_n″(x) + ( 2a(x) θ′(x) + b(x) θ(x) ) g_n′(x) + ( a(x) θ″(x) + b(x) θ′(x) + λ_n u(x) θ(x) ) g_n(x) = −λ_n β_n u(x).
In this sense, the second point is that the relation
∫_a^b w(x) θ(x) g_n(x) dx / ∫_a^b w(x) dx = ∫_a^b w(x) ( Φ_n(x) − β_n ) dx / ∫_a^b w(x) dx = −β_n,
will change Equation (196) to
a(x) θ(x) g_n″(x) + ( 2a(x) θ′(x) + b(x) θ(x) ) g_n′(x) + ( a(x) θ″(x) + b(x) θ′(x) + λ_n u(x) θ(x) ) g_n(x) = λ_n u(x) ( ∫_a^b w(x) θ(x) g_n(x) dx / ∫_a^b w(x) dx ),
which is a particular case of Equation (65) in Theorem 7.
Now, noting the two above-mentioned points, for a real parameter λ let
P n ( x ; λ ) = k = 0 n a k ( n ) ( x λ ) k n = 0 ,
be a sequence of polynomials orthogonal with respect to w ( x ) on [ a , b ] , such that
E P m ( x ; λ ) P n ( x ; λ ) w ( x ) = E P n 2 ( x ; λ ) w ( x ) δ m , n .
It can be verified that the sequence
Q n ( x ; λ ) = P n + 1 ( x ; λ ) P n + 1 ( λ ; λ ) x λ = k = 0 n a k + 1 ( n + 1 ) ( x λ ) k ,
is also a polynomial of degree n. With reference to (195) and (197), the following relations hold for the polynomial sequence (198),
a b w ( x ) ( x λ ) 2 d x a b w ( x ) d x cov 1 Q m ( x ; λ ) , Q n ( x ; λ ) ; 1 x λ w ( x ) ( x λ ) 2 = cov ( x λ ) Q m ( x ; λ ) , ( x λ ) Q n ( x ; λ ) w ( x ) = cov P m + 1 ( x ; λ ) P m + 1 ( λ ; λ ) , P n + 1 ( x ; λ ) P n + 1 ( λ ; λ ) w ( x ) = cov P m + 1 ( x ; λ ) , P n + 1 ( x ; λ ) w ( x ) = E P n + 1 2 ( x ; λ ) w ( x ) δ m , n .
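The construction (198) is straightforward to reproduce symbolically. The sketch below (an illustration only; monic Legendre polynomials on [−1, 1] with λ = 1 are an assumed sample family) confirms that Q_n has degree n and anticipates the biorthogonality relation discussed next:

```python
# Sympy sketch of (198): Q_n(x; λ) = (P_{n+1}(x; λ) - P_{n+1}(λ; λ))/(x - λ),
# here built from monic Legendre polynomials (a sample orthogonal family).
import sympy as sp

x = sp.symbols('x')
lam = sp.Integer(1)                        # the fixed point λ

def P(n):                                  # monic Legendre polynomial of degree n
    q = sp.legendre(n, x)
    return sp.expand(q / sp.Poly(q, x).LC())

def Q(n):
    num = P(n + 1) - P(n + 1).subs(x, lam)
    quo, rem = sp.div(num, x - lam, x)     # exact division, by construction
    assert rem == 0
    return sp.expand(quo)

for n in range(4):
    assert sp.degree(Q(n), x) == n         # Q_n is indeed a polynomial of degree n
    for m in range(1, n + 1):              # biorthogonality w.r.t. (x - λ) w(x), m <= n
        assert sp.integrate((x - lam)*Q(n)*P(m), (x, -1, 1)) == 0
```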
Corollary 7.
From (199), the relation
cov 1 Q m ( x ; λ ) , Q n ( x ; λ ) ; 1 x λ w ( x ) ( x λ ) 2 = a b w ( x ) P n + 1 2 ( x ; λ ) d x a b w ( x ) ( x λ ) 2 d x δ m , n ,
shows that the polynomial set Q n ( x ; λ ) n = 0 is a complete uncorrelated sequence with respect to the fixed function z ( x ) = 1 x λ and the probability function P r ( X = x ) = w ( x ) ( x λ ) 2 a b w ( x ) ( x λ ) 2 d x , respectively. Also, relation (199) shows that the two defined sequences
P n ( x ; λ ) = k = 0 n a k ( n ) ( x λ ) k n = 0 and Q n ( x ; λ ) = k = 0 n a k + 1 ( n + 1 ) ( x λ ) k n = 0
are biorthogonal with respect to the weight function ( x λ ) w ( x ) on [ a , b ] , as we have
cov 1 Q m ( x ; λ ) , Q n ( x ; λ ) ; 1 x λ w ( x ) ( x λ ) 2 = E Q m ( x ; λ ) Q n ( x ; λ ) E Q n ( x ; λ ) / ( x λ ) E 1 / ( x λ ) 2 1 x λ w ( x ) ( x λ ) 2 ,
where
Q n ( x ; λ ) E Q n ( x ; λ ) / ( x λ ) E 1 / ( x λ ) 2 1 x λ = P n + 1 ( x ; λ ) x λ .
There is a direct proof for this result in an inner product space, too. If we suppose
P m ( x ; λ ) , P n ( x ; λ ) w ( x ) = P n ( x ; λ ) , P n ( x ; λ ) w ( x ) δ m , n ,
then, for every m N , we have
Q n ( x ; λ ) , P m ( x ; λ ) ( x λ ) w ( x ) = P n + 1 ( x ; λ ) P n + 1 ( λ ; λ ) x λ , P m ( x ; λ ) ( x λ ) w ( x ) = P n + 1 ( x ; λ ) , P m ( x ; λ ) w ( x ) P n + 1 ( λ ; λ ) 1 , P m ( x ; λ ) w ( x ) = P n + 1 ( x ; λ ) , P n + 1 ( x ; λ ) w ( x ) δ n + 1 , m .
Using (200), one can create a biorthogonal approximation (or expansion as n ) for any appropriate function, say f ( x ) , in terms of the uncorrelated polynomials Q k ( x ; λ ) k = 0 as follows:
f ( x ) k = 0 n c k Q k ( x ; λ ) = k = 0 n f ( x ) , P k + 1 ( x ; λ ) ( x λ ) w ( x ) P k + 1 ( x ; λ ) , P k + 1 ( x ; λ ) w ( x ) P k + 1 ( x ; λ ) P k + 1 ( λ ; λ ) x λ ,
whose error is clearly minimized with respect to the fixed function z(x) = 1/(x − λ) in the sense of least 1-variances. In the sequel, since {P_n(x; λ)}_{n=0}^∞ was assumed to be orthogonal, its monic form must satisfy a three-term recurrence relation [25] of the form
P ¯ n + 1 ( x ; λ ) = ( x B n ) P ¯ n ( x ; λ ) C n P ¯ n 1 ( x ; λ ) with P ¯ 0 ( x ; λ ) = 1 and P ¯ 1 ( x ; λ ) = x B 1 .
After some computations carried out by hand, substituting (198) into (201) gives
Q ¯ n + 1 ( x ; λ ) = ( x B n + 1 ) Q ¯ n ( x ; λ ) C n + 1 Q ¯ n 1 ( x ; λ ) + P ¯ n + 1 ( λ ; λ ) with Q ¯ 0 ( x ; λ ) = 1 .
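A quick symbolic check of (202) is possible as well. The sketch below assumes the standard monic Chebyshev recurrence data B_n = 0, C_1 = 1/2 and C_n = 1/4 for n ≥ 2 (an illustrative choice, not the paper's code) and compares the recurrence against the divided-difference definition (198):

```python
# Sympy sketch of the recurrence (202) for the Q̄_n, with monic Chebyshev input data.
import sympy as sp

x = sp.symbols('x')
lam = sp.Rational(1, 3)                    # arbitrary sample value of λ

def T(n):                                  # monic Chebyshev polynomial
    return sp.expand(sp.chebyshevt(n, x) / 2**(n - 1)) if n >= 1 else sp.Integer(1)

B = lambda n: 0                            # recurrence data of the monic family
C = lambda n: sp.Rational(1, 2) if n == 1 else sp.Rational(1, 4)

Qdir = lambda n: sp.div(T(n + 1) - T(n + 1).subs(x, lam), x - lam, x)[0]

Q0, Q1 = sp.Integer(1), sp.expand(Qdir(1))
for n in range(1, 5):   # Q̄_{n+1} = (x - B_{n+1}) Q̄_n - C_{n+1} Q̄_{n-1} + P̄_{n+1}(λ; λ)
    Q2 = sp.expand((x - B(n + 1))*Q1 - C(n + 1)*Q0 + T(n + 1).subs(x, lam))
    assert sp.expand(Q2 - Qdir(n + 1)) == 0
    Q0, Q1 = Q1, Q2
```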
The recurrence relation (202) helps us obtain an analogue of the well-known Christoffel–Darboux identity [25] as follows. From (202), we have, respectively,
x Q ¯ n ( x ; λ ) + P ¯ n + 1 ( λ ; λ ) Q ¯ n ( t ; λ ) = Q ¯ n + 1 ( x ; λ ) Q ¯ n ( t ; λ ) + B n + 1 Q ¯ n ( x ; λ ) Q ¯ n ( t ; λ ) + C n + 1 Q ¯ n 1 ( x ; λ ) Q ¯ n ( t ; λ ) ,
and
t Q ¯ n ( t ; λ ) + P ¯ n + 1 ( λ ; λ ) Q ¯ n ( x ; λ ) = Q ¯ n + 1 ( t ; λ ) Q ¯ n ( x ; λ ) + B n + 1 Q ¯ n ( t ; λ ) Q ¯ n ( x ; λ ) + C n + 1 Q ¯ n 1 ( t ; λ ) Q ¯ n ( x ; λ ) .
Therefore, by defining the kernel
G n ( x , t ) = 1 j = 1 n + 1 C j Q ¯ n + 1 ( x ; λ ) Q ¯ n ( t ; λ ) Q ¯ n + 1 ( t ; λ ) Q ¯ n ( x ; λ ) x t ,
we eventually obtain
n = 0 m 1 j = 1 n + 1 C j Q ¯ n ( x ; λ ) Q ¯ n ( t ; λ ) P ¯ n + 1 ( λ ; λ ) Q ¯ n ( x ; λ ) Q ¯ n ( t ; λ ) x t = n = 0 m G n ( x , t ) G n 1 ( x , t ) = 1 j = 1 m + 1 C j Q ¯ m + 1 ( x ; λ ) Q ¯ m ( t ; λ ) Q ¯ m + 1 ( t ; λ ) Q ¯ m ( x ; λ ) x t .
We now introduce two uncorrelated polynomial sequences of hypergeometric type that are constructed based on Jacobi and Laguerre polynomials, and then we apply all the above-mentioned results to them.

9.1. An Uncorrelated Sequence of Hypergeometric Polynomials of F 2 3 Type

The monic type of Jacobi polynomials [20]
P ¯ n ( α , β ) ( x ) = 2 n ( α + 1 ) n ( n + α + β + 1 ) n F 1 2 n , n + α + β + 1 α + 1 1 x 2 ,
satisfy the equation
( 1 x 2 ) d 2 d x 2 P ¯ n ( α , β ) ( x ) ( α + β + 2 ) x + α β d d x P ¯ n ( α , β ) ( x ) + n ( n + α + β + 1 ) P ¯ n ( α , β ) ( x ) = 0 ,
and are orthogonal with respect to the weight function ( 1 x ) α ( 1 + x ) β on [ 1 , 1 ] as
1 1 ( 1 x ) α ( 1 + x ) β P ¯ m ( α , β ) ( x ) P ¯ n ( α , β ) ( x ) d x = n ! 2 2 n + α + β + 1 Γ ( n + α + β + 1 ) Γ ( n + α + 1 ) Γ ( n + β + 1 ) Γ ( 2 n + α + β + 1 ) Γ ( 2 n + α + β + 2 ) δ n , m .
Since P ¯ n ( α , β ) ( x ) = ( 1 ) n P ¯ n ( β , α ) ( x ) , another representation of these polynomials is as
P ¯ n ( α , β ) ( x ) = ( 1 ) n 2 n ( β + 1 ) n ( n + α + β + 1 ) n F 1 2 n , n + α + β + 1 β + 1 1 + x 2 .
They also satisfy a three-term recurrence relation of the form
P ¯ n + 1 ( α , β ) ( x ) = x β 2 α 2 ( 2 n + α + β ) ( 2 n + α + β + 2 ) P ¯ n ( α , β ) ( x ) 4 n ( n + α ) ( n + β ) ( n + α + β ) ( 2 n + α + β + 1 ) ( 2 n + α + β ) 2 ( 2 n + α + β 1 ) P ¯ n 1 ( α , β ) ( x ) .
Representations (204) and (207) show that there are two specific values for λ in (198), i.e., λ = 1 and λ = −1. Noting that P̄_n^{(α,β)}(1) = 2^n (α + 1)_n / (n + α + β + 1)_n, the first kind of uncorrelated polynomials is defined as
Q ¯ n ( α , β ) ( x ; 1 ) = P ¯ n + 1 ( α , β ) ( 1 ) P ¯ n + 1 ( α , β ) ( x ) 1 x = ( n + 1 ) 2 n ( α + 2 ) n ( n + α + β + 3 ) n F 2 3 n , n + α + β + 3 , 1 α + 2 , 2 1 x 2 .
Also, for λ = 1 the second kind is defined by
Q ¯ n ( α , β ) ( x ; 1 ) = P ¯ n + 1 ( α , β ) ( x ) P ¯ n + 1 ( α , β ) ( 1 ) x + 1 = ( 1 ) n + 1 P ¯ n + 1 ( β , α ) ( x ) ( 1 ) n + 1 P ¯ n + 1 ( β , α ) ( 1 ) 1 ( x ) = ( 1 ) n Q ¯ n ( β , α ) ( x ; 1 ) = ( n + 1 ) ( 2 ) n ( β + 2 ) n ( n + α + β + 3 ) n F 2 3 n , n + α + β + 3 , 1 β + 2 , 2 1 + x 2 .
Relation (210) reveals that we, in fact, deal with only one value, i.e., λ = 1 .
If in (209), P ¯ n + 1 ( α , β ) ( x ) = ( x 1 ) Q ¯ n ( α , β ) ( x ; 1 ) + P ¯ n + 1 ( α , β ) ( 1 ) is substituted into the differential Equation (205), we obtain
( 1 x ) 2 ( 1 + x ) d 2 d x 2 Q ¯ n ( α , β ) ( x ; 1 ) ( 1 x ) ( α + β + 4 ) x + α β + 2 d d x Q ¯ n ( α , β ) ( x ; 1 ) + n ( n + α + β + 3 ) ( 1 x ) + 2 α + 2 Q ¯ n ( α , β ) ( x ; 1 ) = ( n + 1 ) ( n + α + β + 2 ) P ¯ n + 1 ( α , β ) ( 1 ) = ( n + 1 ) ( n + α + β + 2 ) 2 n ( α + 1 ) n ( n + α + β + 1 ) n .
On the other hand,
1 1 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) d x = 1 1 ( 1 x ) α ( 1 + x ) β P ¯ n + 1 ( α , β ) ( 1 ) P ¯ n + 1 ( α , β ) ( x ) d x = P ¯ n + 1 ( α , β ) ( 1 ) 1 1 ( 1 x ) α ( 1 + x ) β d x = P ¯ n + 1 ( α , β ) ( 1 ) 2 α + β + 1 Γ ( α + 1 ) Γ ( β + 1 ) Γ ( α + β + 2 ) ,
changes Equation (211) to
( 1 x ) 2 ( 1 + x ) d 2 d x 2 Q ¯ n ( α , β ) ( x ; 1 ) ( 1 x ) ( α + β + 4 ) x + α β + 2 d d x Q ¯ n ( α , β ) ( x ; 1 ) + γ n * ( 1 x ) + 2 α + 2 Q ¯ n ( α , β ) ( x ; 1 ) = Γ ( α + β + 2 ) ( γ n * + α + β + 2 ) 2 α + β + 1 Γ ( α + 1 ) Γ ( β + 1 ) 1 1 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) d x ,
in which γ n * = n ( n + α + β + 3 ) .
Theorem 14.
For any α , β > 1 , we have
cov 1 Q ¯ m ( α , β ) ( x ; 1 ) , Q ¯ n ( α , β ) ( x ; 1 ) ; 1 1 x ( 1 x ) α + 2 ( 1 + x ) β = n ! 2 2 n 2 Γ ( α + β + 4 ) Γ ( n + α + β + 1 ) Γ ( n + α + 1 ) Γ ( n + β + 1 ) Γ ( α + 3 ) Γ ( β + 1 ) Γ ( 2 n + α + β + 1 ) Γ ( 2 n + α + β + 2 ) δ m , n .
Proof. 
We would like to prove this theorem via differential Equation (212) so that, if it is written in the self-adjoint form
d d x ( 1 x ) α + 3 ( 1 + x ) β + 1 d d x Q ¯ n ( α , β ) ( x ; 1 ) + γ n * ( 1 x ) α + 2 ( 1 + x ) β + 2 α + 2 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) = Γ ( α + β + 2 ) ( γ n * + α + β + 2 ) 2 α + β + 1 Γ ( α + 1 ) Γ ( β + 1 ) 1 1 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) d x ( 1 x ) α + 1 ( 1 + x ) β ,
then
( 1 x ) α + 3 ( 1 + x ) β + 1 Q ¯ m ( α , β ) ( x ; 1 ) d d x Q ¯ n ( α , β ) ( x ; 1 ) Q ¯ n ( α , β ) ( x ; 1 ) d d x Q ¯ m ( α , β ) ( x ; 1 ) 1 1 + γ n * γ m * 1 1 ( 1 x ) α + 2 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) Q ¯ m ( α , β ) ( x ; 1 ) d x = Γ ( α + β + 2 ) 2 α + β + 1 Γ ( α + 1 ) Γ ( β + 1 ) γ n * γ m * × 1 1 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) d x 1 1 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ m ( α , β ) ( x ; 1 ) d x ,
leading to the result
1 1 ( 1 x ) α + 2 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) Q ¯ m ( α , β ) ( x ; 1 ) d x = 1 1 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ n ( α , β ) ( x ; 1 ) d x 1 1 ( 1 x ) α + 1 ( 1 + x ) β Q ¯ m ( α , β ) ( x ; 1 ) d x 1 1 ( 1 x ) α ( 1 + x ) β d x = P ¯ n + 1 ( α , β ) ( 1 ) P ¯ m + 1 ( α , β ) ( 1 ) 1 1 ( 1 x ) α ( 1 + x ) β d x = 2 n + m + α + β + 1 Γ ( α + 1 ) Γ ( β + 1 ) Γ ( α + β + 2 ) ( α + 1 ) n ( α + 1 ) m ( n + α + β + 1 ) n ( m + α + β + 1 ) m n m ,
which proves the first part. To obtain the variance value, i.e., for n = m in (213), it is enough to refer to Corollary 7 and then apply relation (206). □
Noting that, in (208), C_j = (1/4) j (j + α)(j + β)(j + α + β) / ( ( j + (α + β + 1)/2 )( j + (α + β)/2 )^2 ( j + (α + β − 1)/2 ) ), we have
∏_{j=1}^{m+1} C_j = ( 1/4^{m+1} ) (1)_{m+1} (α + 1)_{m+1} (β + 1)_{m+1} (α + β + 1)_{m+1} / ( ( (α + β + 3)/2 )_{m+1} ( (α + β + 2)/2 )_{m+1}^2 ( (α + β + 1)/2 )_{m+1} ).
Hence, the identity (203) can be applied straightforwardly.

Some Particular Trigonometric Cases

There are four trigonometric cases of Jacobi polynomials that are known in the literature as the Chebyshev polynomials of the first, second, third, and fourth kinds. The main advantage of these trigonometric polynomials is that their roots are explicitly known [25]; see also [28]. Their monic forms are represented, respectively, as follows:
T ¯ n ( x ) = P ¯ n ( 1 2 , 1 2 ) ( x ) = 1 2 n 1 cos n arccos x = k = 1 n ( x cos ( 2 k 1 ) π 2 n ) , U ¯ n ( x ) = P ¯ n ( 1 2 , 1 2 ) ( x ) = 1 2 n 1 x 2 sin ( n + 1 ) arccos x = k = 1 n ( x cos k π n + 1 ) , V ¯ n ( x ) = P ¯ n ( 1 2 , 1 2 ) ( x ) = 1 2 n 2 1 + x cos ( ( n + 1 2 ) arccos x ) = k = 1 n ( x cos ( 2 k 1 ) π 2 n + 1 ) , W ¯ n ( x ) = P ¯ n ( 1 2 , 1 2 ) ( x ) = 1 2 n 2 1 x sin ( ( n + 1 2 ) arccos x ) = k = 1 n ( x cos 2 k π 2 n + 1 ) .
When it is noted that
T n ( x ) = 2 n 1 T ¯ n ( x ) , U n ( x ) = 2 n U ¯ n ( x ) , V n ( x ) = 2 n V ¯ n ( x ) and W n ( x ) = 2 n W ¯ n ( x ) ,
they also satisfy the following orthogonality relations:
1 1 T n ( x ) T m ( x ) 1 1 x 2 d x = π 2 δ n , m , π if n = m = 0 , 1 1 U n ( x ) U m ( x ) 1 x 2 d x = π 2 δ n , m , 1 1 V n ( x ) V m ( x ) 1 + x 1 x d x = π δ n , m , 1 1 W n ( x ) W m ( x ) 1 x 1 + x d x = π δ n , m .
Now, we can employ relations (214) and define four trigonometric uncorrelated sequences, according to the main definition (209), as follows:
T ¯ n ( x ; 1 ) = Q ¯ n ( 1 2 , 1 2 ) ( x ; 1 ) = 1 T ¯ n + 1 ( x ) 1 x = ( n + 1 ) 2 n ( 3 / 2 ) n ( n + 2 ) n F 2 3 n , n + 2 , 1 3 / 2 , 2 1 x 2 = k = 1 n ( x cos 2 k π n + 1 ) ,
U ¯ n ( x ; 1 ) = Q ¯ n ( 1 2 , 1 2 ) ( x ; 1 ) = n + 2 U ¯ n + 1 ( x ) 1 x = ( n + 1 ) 2 n ( 5 / 2 ) n ( n + 4 ) n F 2 3 n , n + 4 , 1 5 / 2 , 2 1 x 2 ,
V ¯ n ( x ; 1 ) = Q ¯ n ( 1 2 , 1 2 ) ( x ; 1 ) = 1 V ¯ n + 1 ( x ) 1 x = ( n + 1 ) 2 n ( 3 / 2 ) n ( n + 3 ) n F 2 3 n , n + 3 , 1 3 / 2 , 2 1 x 2 = k = 1 n ( x cos 2 k π n ) ,
W ¯ n ( x ; 1 ) = Q ¯ n ( 1 2 , 1 2 ) ( x ; 1 ) = 2 n + 3 W ¯ n + 1 ( x ) 1 x = ( n + 1 ) 2 n ( 5 / 2 ) n ( n + 3 ) n F 2 3 n , n + 3 , 1 5 / 2 , 2 1 x 2 .
As we observe, only relations (216) and (217) are decomposable, having multiple (repeated) roots. In this direction, it is worth mentioning that there is a generic decomposable sequence,
T n ( x ; λ ) = T n + 1 ( x ) T n + 1 ( λ ) x λ = 2 n k = 1 n x λ cos 2 k π n + 1 1 λ 2 sin 2 k π n + 1 ,
satisfying the non-homogeneous differential equation
( x λ ) ( 1 x 2 ) d 2 d x 2 T n ( x ; λ ) + 3 x 2 + λ x + 2 d d x T n ( x ; λ ) + ( n + 1 ) 2 ( x λ ) x T n ( x ; λ ) = ( n + 1 ) 2 cos ( ( n + 1 ) arccos λ ) ,
and the recurrence relation
T n + 1 ( x ; λ ) = 2 x T n ( x ; λ ) T n 1 ( x ; λ ) + 2 cos ( n arccos λ ) .
For instance,
T n ( x ; 0 ) = T n + 1 ( x ) T n + 1 ( 0 ) x = 2 n k = 1 n ( x + sin 2 k π n + 1 ) ,
is an uncorrelated polynomial with respect to the fixed function z(x) = 1/x, so that, according to Corollary 7 and relation (215), we have
1 1 T n + 1 2 ( x ) 1 x 2 d x = 1 1 x 2 1 x 2 d x = π 2 ,
and as a result,
cov 1 T m ( x ; 0 ) , T n ( x ; 0 ) ; 1 x x 2 1 x 2 = 1 1 x 2 1 x 2 T m ( x ; 0 ) T n ( x ; 0 ) d x 1 π 1 1 x 1 x 2 T m ( x ; 0 ) d x 1 1 x 1 x 2 T n ( x ; 0 ) d x = δ m , n .
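In normalized form, this uncorrelatedness is easy to confirm numerically. The following sketch (an illustration, not from the paper) uses the substitution x = cos t to avoid the endpoint singularity and checks that cov_1(T_m(x; 0), T_n(x; 0); 1/x) = δ_{m,n} under the density proportional to x^2/√(1 − x^2):

```python
# Numeric sketch: T_n(x; 0) = (T_{n+1}(x) - T_{n+1}(0))/x is 1-uncorrelated
# with respect to z(x) = 1/x under the density x^2 / sqrt(1 - x^2).
import numpy as np
from numpy.polynomial import chebyshev as Ch
from scipy.integrate import quad

def Tshift(n):                      # power-basis coefficients of (T_{n+1}(x) - T_{n+1}(0))/x
    c = np.zeros(n + 2); c[n + 1] = 1.0
    p = Ch.cheb2poly(c)
    p[0] -= Ch.chebval(0.0, c)      # remove the constant term T_{n+1}(0)
    return np.polynomial.Polynomial(p[1:])

def E(f):                           # expectation; substitution x = cos(t) removes the singularity
    return quad(lambda t: f(np.cos(t))*np.cos(t)**2, 0, np.pi)[0] / (np.pi/2)

def cov1(f, g):                     # cov_1(f, g; 1/x) under the density x^2 / sqrt(1 - x^2)
    return E(lambda x: f(x)*g(x)) - E(lambda x: f(x)/x)*E(lambda x: g(x)/x)/E(lambda x: 1.0/x**2)

for m in range(4):
    for n in range(4):
        assert abs(cov1(Tshift(m), Tshift(n)) - (m == n)) < 1e-8
```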

9.2. An Uncorrelated Sequence of Hypergeometric Polynomials of F 2 2 Type

At this turn, we consider the monic type of the (generalized) Laguerre polynomials [20,25]
L ¯ n ( α ) ( x ) = ( 1 ) n ( α + 1 ) n F 1 1 n α + 1 x ,
satisfying the equation
x d 2 d x 2 L ¯ n ( α ) ( x ) + α + 1 x d d x L ¯ n ( α ) ( x ) + n L ¯ n ( α ) ( x ) = 0 .
They also satisfy the recurrence relation
L ¯ n + 1 ( α ) ( x ) = x 2 n α 1 L ¯ n ( α ) ( x ) n ( n + α ) L ¯ n 1 ( α ) ( x ) .
Noting that L̄_n^{(α)}(0) = (−1)^n (α + 1)_n, the uncorrelated polynomials based on the Laguerre polynomials are defined as
Q̄_n^{(α)}(x; 0) = ( L̄_{n+1}^{(α)}(x) − L̄_{n+1}^{(α)}(0) ) / x = (−1)^n (n + 1) (α + 2)_n 2F2(−n, 1; α + 2, 2; x).
Similarly to the previous example, for α > −1, we can prove that
cov_1( Q̄_m^{(α)}(x; 0), Q̄_n^{(α)}(x; 0); 1/x ) |_{x^{α+2} e^{−x}} = ( n! Γ(n + α + 1) / Γ(α + 3) ) δ_{m,n}.
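A direct symbolic check of this uncorrelatedness is immediate. The sketch below (with the sample value α = 1, an assumption made only for the demonstration) verifies that the off-diagonal 1-covariances vanish:

```python
# Sympy sketch: the Laguerre-based Q̄_n are pairwise 1-uncorrelated with
# respect to z(x) = 1/x under the density x^(alpha+2) e^(-x) on (0, inf).
import sympy as sp

x = sp.symbols('x')
alpha = sp.Integer(1)                       # sample value, assumed for the check

def L(n):                                   # monic generalized Laguerre polynomial
    q = sp.assoc_laguerre(n, alpha, x)
    return sp.expand(q / sp.Poly(q, x).LC())

def Q(n):                                   # Q̄_n(x; 0) = (L̄_{n+1}(x) - L̄_{n+1}(0))/x
    return sp.div(L(n + 1) - L(n + 1).subs(x, 0), x, x)[0]

u = x**(alpha + 2)*sp.exp(-x)               # density (up to normalization)
E = lambda h: sp.integrate(u*h, (x, 0, sp.oo)) / sp.integrate(u, (x, 0, sp.oo))
cov1 = lambda f, g: sp.simplify(E(f*g) - E(f/x)*E(g/x)/E(1/x**2))

for m in range(3):
    for n in range(m):                      # off-diagonal entries must vanish
        assert cov1(Q(m), Q(n)) == 0
```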

10. A Unified Approach for the Polynomials Obtained in Section 7, Section 8 and Section 9

According to the distributions given in Table 1, only the beta and gamma weight functions can be considered for the non-symmetric infinite cases of uncorrelated polynomials, as the normal distribution is connected to a special case of the gamma distribution. In this direction, the general properties of the two polynomials, (177) and (187), reveal that the most general case of complete uncorrelated polynomials relevant to the beta weight function is when w(x) = x^a (1 − x)^b and w(x) z(x) = x^c (1 − x)^d, i.e., z(x) = x^{c−a} (1 − x)^{d−b}, where a, b, c, d ∈ R and x ∈ [0, 1]. Hence, if the corresponding uncorrelated polynomial is denoted by P_n(x; a, b, c, d), we have
0 1 x a ( 1 x ) b P n ( x ; a , b , c , d ) P m ( x ; a , b , c , d ) d x Γ ( 2 c + 2 d a b + 2 ) Γ ( 2 c a + 1 ) Γ ( 2 d b + 1 ) 0 1 x c ( 1 x ) d P n ( x ; a , b , c , d ) d x 0 1 x c ( 1 x ) d P m ( x ; a , b , c , d ) d x = 0 1 x a ( 1 x ) b P n 2 ( x ; a , b , c , d ) d x Γ ( 2 c + 2 d a b + 2 ) Γ ( 2 c a + 1 ) Γ ( 2 d b + 1 ) 0 1 x c ( 1 x ) d P n ( x ; a , b , c , d ) d x 2 δ m , n ,
provided that 2c − a + 1 > 0, 2d − b + 1 > 0, and b, d > −1.
The components of the determinant (172) corresponding to this generic polynomial are computed as follows
cov 1 x i , x j ; x c a ( 1 x ) d b w ( x ) = x a ( 1 x ) b = Γ ( b + 1 ) Γ ( a + i + j + 1 ) Γ ( a + b + i + j + 2 ) Γ ( 2 c + 2 d a b + 2 ) Γ 2 ( d + 1 ) Γ ( 2 c a + 1 ) Γ ( 2 d b + 1 ) Γ ( c + i + 1 ) Γ ( c + j + 1 ) Γ ( c + d + i + 2 ) Γ ( c + d + j + 2 ) ,
in which 2c − a + 1 > 0, 2d − b + 1 > 0, and b, d > −1.
According to the preceding information, the polynomials (177) can be represented as
F 3 4 n , n + 1 , r , r + 2 1 , r + 1 , r + 1 x = P n ( x ; 0 , 0 , r , 0 ) ,
and the polynomials (187) as
4F3(−n, n + 2r + 1, r, r + 2; 2r + 1, r + 1, r + 1; x) = P_n(x; 2r, 0, r, 0),
and noting (213), since
0 1 x α + 2 ( 1 x ) β F 2 3 n , n + α + β + 3 , 1 α + 2 , 2 x F 2 3 m , m + α + β + 3 , 1 α + 2 , 2 x d x = 1 0 1 x α ( 1 x ) β d x 0 1 x α + 1 ( 1 x ) β F 2 3 n , n + α + β + 3 , 1 α + 2 , 2 x d x × 0 1 x α + 1 ( 1 x ) β F 2 3 m , m + α + β + 3 , 1 α + 2 , 2 x d x n m ,
the shifted polynomials (210) on [ 0 , 1 ] are represented as
F 2 3 n , n + α + β + 3 , 1 α + 2 , 2 x = P n ( x ; α + 2 , β , α + 1 , β ) .
Similarly, the most general case of complete uncorrelated polynomials relevant to the gamma weight function is when w(x) = x^a e^{−bx} and w(x) z(x) = x^c e^{−dx}, i.e., z(x) = x^{c−a} e^{−(d−b)x}, where a, b, c, d ∈ R and x ∈ R^+. Therefore, if the corresponding uncorrelated polynomial is denoted by Q_n(x; a, b, c, d), then
0 x a e b x Q n ( x ; a , b , c , d ) Q m ( x ; a , b , c , d ) d x ( 2 d b ) 2 c a + 1 Γ ( 2 c a + 1 ) 0 x c e d x Q n ( x ; a , b , c , d ) d x 0 x c e d x Q m ( x ; a , b , c , d ) d x = 0 x a e b x Q n 2 ( x ; a , b , c , d ) d x ( 2 d b ) 2 c a + 1 Γ ( 2 c a + 1 ) 0 x c e d x Q n ( x ; a , b , c , d ) d x 2 δ m , n ,
provided that 2c − a + 1 > 0, 2d − b > 0, and b, d > 0.
The components of the determinant (172) corresponding to this second generic polynomial are computed as
cov 1 x i , x j ; x c a e ( d b ) x w ( x ) = x a e b x = Γ ( a + i + j + 1 ) b a + i + j + 1 ( 2 d b ) 2 c a + 1 Γ ( 2 c a + 1 ) Γ ( c + i + 1 ) Γ ( c + j + 1 ) d 2 c + i + j + 2 ,
in which 2c − a + 1 > 0, 2d − b > 0, and b, d > 0.
As a sample, the polynomials (218) can be represented as
F 2 2 n , 1 α + 2 , 2 x = Q n ( x ; α + 2 , 1 , α + 1 , 1 ) .

A Basic Example of Uncorrelated Hypergeometric Polynomials of F 3 4 Type

In this section, we are going to obtain the explicit form of P_n(x; a, 0, c, 0) using an interesting technique. Since b = d = 0, we have w(x) = x^a and z(x) = x^{c−a}, both defined on [0, 1]. In the first step, we simplify (220) for b = d = 0 as follows:
cov_1( x^i, x^j; x^{c−a} ) |_{w(x) = x^a} = (c − a − i)(c − a − j) / ( (i + c + 1)(j + c + 1)(i + j + a + 1) ),
where 2c − a + 1 > 0, a ≠ c, and a, c > −1.
Referring to the results (221) and (222) and the fact that P_n(x; a, 0, c, 0) is a generalization of both of them, we may assume, without loss of generality, that it is of the 4F3 type.
Since, in a 4F3-type polynomial of the form
4F3(−n, a_2, a_3, a_4; b_1, b_2, b_3; x) = Σ_{k=0}^n u_k x^k, where u_k = (−n)_k (a_2)_k (a_3)_k (a_4)_k / ( (b_1)_k (b_2)_k (b_3)_k k! ),
we have
u_{n−1} / u_n = − n (n + b_1 − 1)(n + b_2 − 1)(n + b_3 − 1) / ( (n + a_2 − 1)(n + a_3 − 1)(n + a_4 − 1) ),
which must be equal to minus the coefficient of x^{n−1} in the monic polynomial P̄_n(x; a, 0, c, 0) given by the determinant (172). If, for simplicity, we set
a_{i,j} = (c − a − i)(c − a − j) / ( (i + c + 1)(j + c + 1)(i + j + a + 1) ) for i = 0, 1, …, n − 1 and j = 0, 1, …, n,
then, to achieve our goal, we should compute the two following determinants (according to the main determinant (172)),
M_n = det [ a_{i,j} ]_{i = 0, …, n−1; j = 0, …, n−1},
and
N_n = det [ a_{i,j} ]_{i = 0, …, n−1; j = 0, …, n−2, n},
and then obtain the quotient N n / M n (with the aid of advanced mathematical software) as
N_n / M_n = n (n + a)(n + a − c)(n + c) / ( (2n + a)(n + a − c − 1)(n + c + 1) ),
and compare it with (223) to finally obtain the explicit form of the polynomials as
P_n(x; a, 0, c, 0) = 4F3(−n, n + a + 1, a − c, c + 2; a + 1, a − c + 1, c + 1; x),
satisfying the uncorrelatedness condition (219) with b = d = 0 , i.e.,
∫_0^1 x^a P_n(x; a, 0, c, 0) P_m(x; a, 0, c, 0) dx − (2c − a + 1) ∫_0^1 x^c P_n(x; a, 0, c, 0) dx ∫_0^1 x^c P_m(x; a, 0, c, 0) dx = [ ∫_0^1 x^a P_n^2(x; a, 0, c, 0) dx − (2c − a + 1) ( ∫_0^1 x^c P_n(x; a, 0, c, 0) dx )^2 ] δ_{m,n} ( 2c − a + 1 > 0, a ≠ c and a, c > −1 ).
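The determinant quotient used above is reproducible in a few lines of sympy; the paper reports using advanced mathematical software for the same step. In the sketch below the entries (222) are taken with their minus signs restored, exactly as written above:

```python
# Sympy sketch reproducing the quotient N_n/M_n and comparing it with the claim above.
import sympy as sp

a, c = sp.symbols('a c')

def aij(i, j):                             # entries (222), up to an immaterial common factor
    return (c - a - i)*(c - a - j)/((i + c + 1)*(j + c + 1)*(i + j + a + 1))

n = 3                                      # any small degree suffices for the check
M = sp.Matrix(n, n, lambda i, j: aij(i, j))
N = sp.Matrix(n, n, lambda i, j: aij(i, j if j < n - 1 else n))
ratio = sp.cancel(N.det() / M.det())
claim = n*(n + a)*(n + a - c)*(n + c)/((2*n + a)*(n + a - c - 1)*(n + c + 1))
assert sp.simplify(ratio - claim) == 0
```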
For the limit case 2c − a + 1 = 0 in (225), the polynomials (224) reduce to the well-known special case of the shifted Jacobi polynomials defined on [0, 1], i.e.,
P_n( x; a, 0, (a − 1)/2, 0 ) = P_{n,+}^{(0,a)}(x) = 2F1(−n, n + a + 1; a + 1; x) (a > −1).
Also, we have
lim c P n ( x ; a , 0 , c , 0 ) = lim c F 3 4 n , n + a + 1 , a c , c + 2 a + 1 , a c + 1 , c + 1 x = F 1 2 n , n + a + 1 a + 1 x = P n , + ( 0 , a ) ( x ) ,
and
lim a P n ( x ; a , 0 , c , 0 ) = F 1 2 n , c + 2 c + 1 x ( c > 1 ) .
As we mentioned, the polynomials (224) are a generalization of (177) or (221) for ( a , c ) = ( 0 , r ) and a generalization of (187) or (222) for ( a , c ) = ( 2 r , r ) . In order to evaluate the existing integrals in (225), we first have
0 1 x c P n ( x ; a , 0 , c , 0 ) d x = k = 0 n ( n ) k ( n + a + 1 ) k ( a c ) k ( c + 2 ) k ( a + 1 ) k ( a c + 1 ) k ( c + 1 ) k k ! ( c + 1 + k ) = 1 c + 1 F 2 3 n , n + a + 1 , a c a + 1 , a c + 1 1 = 1 c + 1 n ! ( c + 1 ) n ( a + 1 ) n ( a + 1 c ) n ,
and
0 1 x a P n ( x ; a , 0 , c , 0 ) P m ( x ; a , 0 , c , 0 ) d x = 1 a + 1 k = 0 n ( n ) k ( n + a + 1 ) k ( a c ) k ( c + 2 ) k ( a + 2 ) k ( a c + 1 ) k ( c + 1 ) k k ! × F 4 5 m , m + a + 1 , a c , c + 2 , a + 1 + k a + 1 , a c + 1 , c + 1 , a + 2 + k 1 ,
which simplifies the left side of (225) as
0 1 x a P n ( x ; a , 0 , c , 0 ) P m ( x ; a , 0 , c , 0 ) d x ( 2 c a + 1 ) 0 1 x c P n ( x ; a , 0 , c , 0 ) d x 0 1 x c P m ( x ; a , 0 , c , 0 ) d x = 1 a + 1 k = 0 n ( n ) k ( n + a + 1 ) k ( a c ) k ( c + 2 ) k ( a + 2 ) k ( a c + 1 ) k ( c + 1 ) k k ! × F 4 5 m , m + a + 1 , a c , c + 2 , a + 1 + k a + 1 , a c + 1 , c + 1 , a + 2 + k 1 2 c a + 1 ( c + 1 ) 2 n ! ( c + 1 ) n ( a + 1 ) n ( a + 1 c ) n m ! ( c + 1 ) m ( a + 1 ) m ( a + 1 c ) m .
To evaluate (226), we have again used the recurrence relations technique. Since
N ( n ) = F 2 3 n , n + a + 1 , a c a + 1 , a c + 1 1 ,
satisfies the first-order relation
N ( n + 1 ) = ( n + 1 ) ( n + c + 1 ) ( n + a + 1 ) ( n + a c + 1 ) N ( n ) ,
so
F 2 3 n , n + a + 1 , a c a + 1 , a c + 1 1 = n ! ( c + 1 ) n ( a + 1 ) n ( a + 1 c ) n .
However, note that all the results obtained in (226), (190) and (179) could also be derived using the Pfaff–Saalschütz summation theorem employed to derive (147). Similarly, to evaluate the 5F4(·) in (227), we can apply a recurrence technique as follows. Since
M ( m ) = F 4 5 m , m + a + 1 , a c , c + 2 , a + 1 + k a + 1 , a c + 1 , c + 1 , a + 2 + k 1 ,
satisfies the second-order equation
( 2 m + a + 2 ) ( m + a + 2 ) ( m + a + 1 ) ( m + a + 2 c ) ( m + a + 3 + k ) M ( m + 2 ) ( 2 m + a + 3 ) ( m + a + 1 ) ( m + 2 ) 2 m ( m + a + 3 ) + ( 2 c a ) k + ( a + 2 ) ( c + 2 ) M ( m + 1 ) + ( 2 m + a + 4 ) ( m + 2 ) ( m + 1 ) ( m + 1 + c ) ( m k ) M ( m ) = 0 ,
having two independent solutions
M 1 ( m ) = m ! ( c + 1 ) m ( a + 1 ) m ( a + 1 c ) m , and M 2 ( m ) = m ! ( k ) m ( a + 1 ) m ( a + 2 + k ) m ,
so,
F 4 5 m , m + a + 1 , a c , c + 2 , a + 1 + k a + 1 , a c + 1 , c + 1 , a + 2 + k 1 = m ! ( c + 1 ) ( c + 1 + k ) ( a + 1 ) m × ( 2 c a + 1 ) ( a + 1 + k ) ( c + 1 ) m ( a + 1 c ) m + ( a c ) ( a c + k ) ( k ) m ( a + 2 + k ) m .
Hence, in order to compute
var 1 P n ( x ; a , 0 , c , 0 ) ; x c a = 0 1 x a P n 2 ( x ; a , 0 , c , 0 ) d x ( 2 c a + 1 ) 0 1 x c P n ( x ; a , 0 , c , 0 ) d x 2 ,
we first suppose in (228) that m = n , and then we refer to (229) to arrive at
a + 1 M * = k = 0 n ( n ) k ( n + a + 1 ) k ( a c ) k ( c + 2 ) k ( a + 2 ) k ( a c + 1 ) k ( c + 1 ) k k ! × F 4 5 n , n + a + 1 , a c , c + 2 , a + 1 + k a + 1 , a c + 1 , c + 1 , a + 2 + k 1 = ( a + 1 ) ( 2 c a + 1 ) ( c + 1 ) 2 ( n ! ) 2 ( c + 1 ) n 2 ( a + 1 ) n 2 ( a c + 1 ) n 2 + ( a + 1 ) ( a c ) 2 ( c + 1 ) 2 n ! ( n + a + 1 ) ( a + 1 ) n 2 k = 0 n ( n ) k ( n + a + 1 ) k ( k ) n ( n + a + 2 ) k k ! .
Again, noting that (−k)_n = 0 for every k < n, the following final result is derived.
Corollary 8.
If 2c − a + 1 > 0, a ≠ c, and a, c > −1, then
∫_0^1 x^a P_n^2(x; a, 0, c, 0) dx − (2c − a + 1) ( ∫_0^1 x^c P_n(x; a, 0, c, 0) dx )^2 = ( 1/(2n + a + 1) ) ( (a − c) n! / ( (c + 1) (a + 1)_n ) )^2.
Moreover, (228) is simplified as
0 1 x a P n ( x ; a , 0 , c , 0 ) P m ( x ; a , 0 , c , 0 ) d x ( 2 c a + 1 ) 0 1 x c P n ( x ; a , 0 , c , 0 ) d x 0 1 x c P m ( x ; a , 0 , c , 0 ) d x = ( a c ) 2 ( c + 1 ) 2 m ! ( m + a + 1 ) ( a + 1 ) m 2 k = 0 n ( n + a + 1 ) k ( n ) k ( m + a + 2 ) k ( k ) m k ! = 0 m n ,
leading to the same result as relation (193) for 2r = a > −1.
Corollary 9.
Using the latter corollary, we can now construct an optimized polynomial approximation (or expansion as n ) for f ( x ) , whose error 1-variance is minimized as follows:
f ( x ) k = 0 n A k B k F 3 4 k , k + a + 1 , a c , c + 2 a + 1 , a c + 1 , c + 1 x ,
in which
A k = 0 1 x a f ( x ) F 3 4 k , k + a + 1 , a c , c + 2 a + 1 , a c + 1 , c + 1 x d x 2 c a + 1 0 1 x c f ( x ) d x 0 1 x c F 3 4 k , k + a + 1 , a c , c + 2 a + 1 , a c + 1 , c + 1 x d x = j = 0 k ( k ) j ( k + a + 1 ) j ( a c ) j ( c + 2 ) j ( a + 1 ) j ( a c + 1 ) j ( c + 1 ) j j ! 0 1 x a + j f ( x ) d x 2 c a + 1 ( c + 1 ) 2 k ! ( c + 1 ) k ( a + 1 ) k ( a + 1 c ) k 0 1 x c f ( x ) d x ,
and
B k = 1 2 k + a + 1 ( a c ) k ! ( c + 1 ) ( a + 1 ) k 2 .
We clearly observe that the basis of the polynomial-type approximation (230) does not satisfy an orthogonality condition but rather a complete uncorrelatedness condition. However, if we define the non-polynomial sequence
G n ( x ; a , c ) = F 3 4 n , n + a + 1 , a c , c + 2 a + 1 , a c + 1 , c + 1 x 2 c a + 1 c + 1 n ! ( c + 1 ) n ( a + 1 ) n ( a + 1 c ) n x c a ,
satisfying the orthogonality condition
0 1 x a G n ( x ; a , c ) G m ( x ; a , c ) d x = 1 2 n + a + 1 ( a c ) n ! ( c + 1 ) ( a + 1 ) n 2 δ m , n 2 c a + 1 > 0 , a c and a , c > 1 ,
we then obtain a non-polynomial type approximation (or expansion) as follows:
f ( x ) k = 0 n A k * B k * G k ( x ; a , c ) ,
in which A_k^* = A_k and B_k^* = B_k according to the important Remark 1 and relation (53), i.e., they take the same forms as (231) and (232). Once again, the orthogonal sequence (233) is generated only when the uncorrelated polynomial sequence (224) has already been generated. Also, the non-polynomial approximation (235) is optimized in the sense of ordinary least squares, while the polynomial-type approximation (230) is optimized in the sense of least 1-variances.

11. p-Uncorrelated Vectors with Respect to a Fixed Vector

The concept of p-uncorrelatedness can similarly be employed in vector spaces. Let A_m = (a_1, a_2, …, a_m) and B_m = (b_1, b_2, …, b_m) be two arbitrary vectors, and let I_m = (1, 1, …, 1) denote the all-ones vector. Also, let Z_m = (z_1, z_2, …, z_m) be a fixed and predetermined vector. Recalling the definition of the inner product of two vectors, A_m · B_m = Σ_{k=1}^m a_k b_k, it is not difficult to verify that
( A_m − (1 − √(1 − p)) ( (A_m · Z_m)/(Z_m · Z_m) ) Z_m ) · ( B_m − (1 − √(1 − p)) ( (B_m · Z_m)/(Z_m · Z_m) ) Z_m ) = A_m · B_m − p (A_m · Z_m)(B_m · Z_m) / (Z_m · Z_m).
For instance, if Z m = I m , then
( A_m − (1 − √(1 − p)) ( (A_m · I_m)/m ) I_m ) · ( B_m − (1 − √(1 − p)) ( (B_m · I_m)/m ) I_m ) = A_m · B_m − (p/m) (A_m · I_m)(B_m · I_m),
where I m . I m = m and p [ 0 , 1 ] . Relation (236) shows that the two vectors A m and B m are p-uncorrelated with respect to the fixed vector Z m if p ( A m . Z m ) ( B m . Z m ) = ( A m . B m ) ( Z m . Z m ) .
Also, the notions of p-covariance and p-variance can be defined for these two vectors (with respect to Z_m) as follows:
cov_p(A_m, B_m; Z_m) = (1/m) ( A_m · B_m − p (A_m · Z_m)(B_m · Z_m) / (Z_m · Z_m) ),
and
var_p(A_m; Z_m) = (1/m) ( A_m · A_m − p (A_m · Z_m)^2 / (Z_m · Z_m) ) ≥ 0.
Referring to the basic representation (43) and definitions (237) and (238), we can establish a set of p-uncorrelated vectors in terms of the parameter p [ 0 , 1 ] if and only if m = n + 1 and the finite set of initial vectors are linearly independent.
Example 2.
For m = 3 , given three initial orthogonal vectors
V_{0,3} = (1, 0, 0), V_{1,3} = (0, 1, 0), V_{2,3} = (0, 0, 1),
together with the fixed vector Z_3 = (1, 2, 3). Substituting them into (43) eventually yields
X_{0,3}(p) = (1, 0, 0), X_{1,3}(p) = ( 2p/(14 − p), 1, 0 ), X_{2,3}(p) = ( 3p/(14 − 5p), 6p/(14 − 5p), 1 ),
which satisfy the conditions
cov p X 0 , 3 ( p ) , X 1 , 3 ( p ) ; Z 3 = cov p X 0 , 3 ( p ) , X 2 , 3 ( p ) ; Z 3 = cov p X 1 , 3 ( p ) , X 2 , 3 ( p ) ; Z 3 = 0 .
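These conditions can be verified symbolically in p. The following sympy sketch (an illustration) checks the three vanishing p-covariances using definition (237):

```python
# Sympy sketch of Example 2: the vectors (239) are pairwise p-uncorrelated w.r.t. Z_3.
import sympy as sp

p = sp.symbols('p')
Z = sp.Matrix([1, 2, 3])

def cov_p(A, B):                           # definition (237) with m = 3
    return sp.simplify((A.dot(B) - p*A.dot(Z)*B.dot(Z)/Z.dot(Z)) / 3)

X0 = sp.Matrix([1, 0, 0])
X1 = sp.Matrix([2*p/(14 - p), 1, 0])
X2 = sp.Matrix([3*p/(14 - 5*p), 6*p/(14 - 5*p), 1])

assert cov_p(X0, X1) == 0
assert cov_p(X0, X2) == 0
assert cov_p(X1, X2) == 0
```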
According to Theorem 4, every arbitrary vector of dimension 3 can be expanded in terms of the above vectors so that we have
A = (a, b, c) = ( cov_p( X_{0,3}(p), A; Z_3 ) / var_p( X_{0,3}(p); Z_3 ) ) (1, 0, 0) + ( cov_p( X_{1,3}(p), A; Z_3 ) / var_p( X_{1,3}(p); Z_3 ) ) ( 2p/(14 − p), 1, 0 ) + ( cov_p( X_{2,3}(p), A; Z_3 ) / var_p( X_{2,3}(p); Z_3 ) ) ( 3p/(14 − 5p), 6p/(14 − 5p), 1 ),
in which
cov_p( X_{0,3}(p), A; Z_3 ) / var_p( X_{0,3}(p); Z_3 ) = a − ( 2p/(14 − p) ) b − ( 3p/(14 − p) ) c , cov_p( X_{1,3}(p), A; Z_3 ) / var_p( X_{1,3}(p); Z_3 ) = b − ( 6p/(14 − 5p) ) c , and cov_p( X_{2,3}(p), A; Z_3 ) / var_p( X_{2,3}(p); Z_3 ) = c .
For p = 0 , the finite expansion (240) would reduce to an ordinary orthogonal expansion, while there is an important point for the case p = 1 . We observe that replacing p = 1 in the last item of (239) gives
X_{2,3}(1) = (1/3) (1, 2, 3) = (1/3) Z_3.
Hence, in (240) (and subsequently (241)),
cov 1 ( X 2 , 3 ( 1 ) , A ; Z 3 ) = cov 1 ( X 2 , 3 ( 1 ) , A ; 3 X 2 , 3 ( 1 ) ) = 0 ,
and
var_1( X_{2,3}(1); Z_3 ) = var_1( (1/3) Z_3; Z_3 ) = 0,
which shows that the finite expansion (240) is not valid for the sole case p = 1 , although it is valid for any other p [ 0 , 1 ) . This is one of the reasons why we have considered this theory for every arbitrary parameter p [ 0 , 1 ] . For a better analysis, see Figure 1 again. In general, we conjecture that
X_{n,m}(1) = ( det( V_{0,m}, V_{1,m}, …, V_{n−1,m}, V_{n,m} ) / det( V_{0,m}, V_{1,m}, …, V_{n−1,m}, Z_m ) ) Z_m (m = n + 1),
as for the above-mentioned example
det( V_{0,3}, V_{1,3}, V_{2,3} ) = det( 1 0 0 ; 0 1 0 ; 0 0 1 ) = 1 and det( V_{0,3}, V_{1,3}, Z_3 ) = det( 1 0 1 ; 0 1 2 ; 0 0 3 ) = 3.
We symbolically examined the conjecture (242) for the particular cases n = 2, 3, and the results confirmed it. Now that the explicit forms of the p-uncorrelated vectors (239) are available, we can construct three parametric orthogonal vectors based on them as follows:
V_{0,3}(p) = X_{0,3}(p) − (1 − √(1 − p)) ( ( X_{0,3}(p) · Z_3 ) / ( Z_3 · Z_3 ) ) Z_3 = (1/14) ( 13 + √(1 − p), −2 + 2√(1 − p), −3 + 3√(1 − p) ), V_{1,3}(p) = ( 1/(14 − p) ) ( 2p − 2 + 2√(1 − p), −p + 10 + 4√(1 − p), −6 + 6√(1 − p) ), V_{2,3}(p) = ( 1/(14 − 5p) ) ( 3p − 3 + 3√(1 − p), 6p − 6 + 6√(1 − p), −5p + 5 + 9√(1 − p) ).
Note in the above vectors that
V 0 , 3 ( 0 ) = V 0 , 3 , V 1 , 3 ( 0 ) = V 1 , 3 and V 2 , 3 ( 0 ) = V 2 , 3 ,
while for p = 1 we have
V_{0,3}(1) = (1/14) (13, −2, −3), V_{1,3}(1) = (3/13) (0, 3, −2) and V_{2,3}(1) = (0, 0, 0),
which confirms that the choice p = 1 is not valid in the orthogonal vectors (243).

12. An Upper Bound for 1-Covariances

As the optimized case is p = 1 , in this part, we are going to obtain an upper bound for cov 1 ( X , Y ; Z ) . Let m X , m Y , M X and M Y be real numbers, such that
m_X Z ≤ X ≤ M_X Z and m_Y Z ≤ Y ≤ M_Y Z.
It can be verified that the following identity holds true:
var_1(X; Z) = ( 1/E(Z^2) ) ( M_X E(Z^2) − E(XZ) ) ( E(XZ) − m_X E(Z^2) ) − E( ( M_X Z − X )( X − m_X Z ) ).
Noting the conditions (244) and this fact that
E( ( M_X Z − X )( X − m_X Z ) ) ≥ 0,
equality (245) leads to the inequality
var_1(X; Z) ≤ ( 1/E(Z^2) ) ( M_X E(Z^2) − E(XZ) ) ( E(XZ) − m_X E(Z^2) ) ≤ ( 1/(4E(Z^2)) ) ( M_X E(Z^2) − m_X E(Z^2) )^2 = ( E(Z^2)/4 ) ( M_X − m_X )^2.
On the other hand, following (246) in the well-known inequality
cov_1^2(X, Y; Z) ≤ var_1(X; Z) var_1(Y; Z),
gives
cov_1^2(X, Y; Z) ≤ ( 1/E^2(Z^2) ) ( M_X E(Z^2) − E(XZ) )( E(XZ) − m_X E(Z^2) ) × ( M_Y E(Z^2) − E(YZ) )( E(YZ) − m_Y E(Z^2) ) ≤ ( E^2(Z^2)/16 ) ( M_X − m_X )^2 ( M_Y − m_Y )^2.
One of the direct consequences of (247) is that
| cov_1(X, Y; Z) | ≤ ( E(Z^2)/4 ) ( M_X − m_X )( M_Y − m_Y ),
where the constant 1 / 4 in (248) is the best possible number in the sense that it cannot be replaced with a smaller quantity. As a particular case, if, in (248), we take
P r ( X = x ) = w ( x ) α β w ( x ) d x , X = f ( x ) , Y = g ( x ) and Z = z ( x ) = 1 ,
it will reduce to the weighted Grüss inequality [29,30]
| ∫_α^β w(x) f(x) g(x) dx / ∫_α^β w(x) dx − ( ∫_α^β w(x) f(x) dx )( ∫_α^β w(x) g(x) dx ) / ( ∫_α^β w(x) dx )^2 | ≤ (1/4) ( M_f − m_f )( M_g − m_g ),
in which
m_f ≤ f(x) ≤ M_f and m_g ≤ g(x) ≤ M_g for all x ∈ [α, β].
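As a quick numerical illustration of (248) in its Grüss form (the functions and sample size below are arbitrary assumptions, not taken from the paper):

```python
# Monte Carlo sketch of the bound (248) with z(x) = 1 (weighted Grüss inequality).
import numpy as np

rng = np.random.default_rng(0)
xs = rng.uniform(0, np.pi, 200_000)        # uniform density on [0, pi]
f, g = np.sin(xs), np.cos(xs)              # sample functions: m_f, M_f = 0, 1; m_g, M_g = -1, 1

cov1 = (f*g).mean() - f.mean()*g.mean()    # here E(Z^2) = 1
bound = 0.25*(1 - 0)*(1 - (-1))            # (1/4)(M_f - m_f)(M_g - m_g) = 1/2
assert abs(cov1) <= bound
```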

13. An Improvement to the Approximate Solutions of Over-Determined Systems

For n > m , consider the linear system of equations
j = 1 m a i , j x j = b i ( i = 1 , 2 , , n ) ,
whose matrix representation is as A n × m X m × 1 = B n × 1 where
A = a 11 a 21 a n 1 a 12 a 22 a n 1 a 1 m a 2 m a n m , X = x 1 x 2 x m and B = b 1 b 2 b n .
As we mentioned in the introduction, the linear system (249) is called an over-determined system since the number of equations is more than the number of unknowns.
Such systems usually have no exact solution, and the goal is instead to find an approximate solution for the unknowns {x_j}_{j=1}^m that fits the equations in the sense of solving the problem
min_{x_j} E_{m,n}(x_1, …, x_m) = min_{x_j} Σ_{i=1}^n ( Σ_{j=1}^m a_{i,j} x_j − b_i )^2.
It has been proven [13] that the minimization problem (250) has a unique vector solution, provided that the m columns of the matrix A are linearly independent; it is given by solving the normal equations A^T A X̃ = A^T b, where A^T indicates the matrix transpose of A, and X̃ is the approximate solution of the least squares type expressed by X̃ = (A^T A)^{−1} A^T b.
Instead of considering the problem (250), we would like now to consider the minimization problem
min_{x_j} V_{m,n}( x_1, …, x_m | {z_i}_{i=1}^n ) = min_{x_j} [ Σ_{i=1}^n ( Σ_{j=1}^m a_{i,j} x_j − b_i )^2 − ( p / Σ_{i=1}^n z_i^2 ) ( Σ_{i=1}^n z_i ( Σ_{j=1}^m a_{i,j} x_j − b_i ) )^2 ],
based on the fixed vector Z 1 × n T = [ z 1 , z 2 , , z n ] , where the quantity (251) is clearly smaller than the quantity (250) for any arbitrary selection of { z i } i = 1 n . In this direction,
V m , n ( x 1 , , x m { z i } i = 1 n ) x k = 2 i = 1 n a i , k ( j = 1 m a i , j x j b i ) 2 p i = 1 n z i 2 i = 1 n a i , k z i i = 1 n z i ( j = 1 m a i , j x j b i ) = 0 ,
leads to the linear system
Σ_{j=1}^m [ Σ_{i=1}^n a_{i,k} a_{i,j} − p ( Σ_{i=1}^n a_{i,k} z_i )( Σ_{i=1}^n a_{i,j} z_i ) / Σ_{i=1}^n z_i^2 ] x_j = Σ_{i=1}^n a_{i,k} b_i − p ( Σ_{i=1}^n a_{i,k} z_i )( Σ_{i=1}^n b_i z_i ) / Σ_{i=1}^n z_i^2 (k = 1, 2, …, m),
which can also be represented as the matrix form
( A^T A − (p/(Z^T Z)) A^T Z Z^T A ) X̃_{p,Z} = A^T B − (p/(Z^T Z)) A^T Z Z^T B,
with the solution
X̃_{p,Z} = ( A^T A − (p/(Z^T Z)) A^T Z Z^T A )^{−1} ( A^T B − (p/(Z^T Z)) A^T Z Z^T B ).
A simple case of the approximate solution (253) is when Z 1 × n T = I n = [ 1 , 1 , , 1 ] and p = 1 , i.e., an ordinary least variance problem such that (253) becomes
X̃_{1,I_n} = ( A^T A − (1/n) A^T I_n^T I_n A )^{−1} ( A^T B − (1/n) A^T I_n^T I_n B ).
Example 3.
Suppose m = 2 , Z 1 × n T = I n = [ 1 , 1 , , 1 ] and p = 1 . Then, the corresponding over-determined system takes the simple form
a i , 1 x 1 + a i , 2 x 2 = b i ( i = 1 , 2 , , n > 2 ) ,
and the problem (251) reduces to
min_{x_1, x_2} V_{2,n}( x_1, x_2 | I_n ) = min_{x_1, x_2} [ Σ_{i=1}^n ( a_{i,1} x_1 + a_{i,2} x_2 − b_i )^2 − (1/n) ( Σ_{i=1}^n ( a_{i,1} x_1 + a_{i,2} x_2 − b_i ) )^2 ].
Hence, the explicit solutions of the system (252), i.e.,
( Σ_{i=1}^n a_{i,1}^2 − (1/n) ( Σ_{i=1}^n a_{i,1} )^2 ) x_1 + ( Σ_{i=1}^n a_{i,1} a_{i,2} − (1/n) Σ_{i=1}^n a_{i,1} Σ_{i=1}^n a_{i,2} ) x_2 = Σ_{i=1}^n a_{i,1} b_i − (1/n) Σ_{i=1}^n a_{i,1} Σ_{i=1}^n b_i , ( Σ_{i=1}^n a_{i,2} a_{i,1} − (1/n) Σ_{i=1}^n a_{i,2} Σ_{i=1}^n a_{i,1} ) x_1 + ( Σ_{i=1}^n a_{i,2}^2 − (1/n) ( Σ_{i=1}^n a_{i,2} )^2 ) x_2 = Σ_{i=1}^n a_{i,2} b_i − (1/n) Σ_{i=1}^n a_{i,2} Σ_{i=1}^n b_i ,
are, respectively,
x 1 = i = 1 n a i , 1 b i 1 n i = 1 n a i , 1 i = 1 n b i i = 1 n ( a i , 2 ) 2 1 n i = 1 n a i , 2 2 i = 1 n a i , 2 b i 1 n i = 1 n a i , 2 i = 1 n b i i = 1 n a i , 1 a i , 2 1 n i = 1 n a i , 1 i = 1 n a i , 2 i = 1 n ( a i , 1 ) 2 1 n i = 1 n a i , 1 2 i = 1 n ( a i , 2 ) 2 1 n i = 1 n a i , 2 2 i = 1 n a i , 1 a i , 2 1 n i = 1 n a i , 1 i = 1 n a i , 2 2 ,
and
x 2 = i = 1 n a i , 2 b i 1 n i = 1 n a i , 2 i = 1 n b i i = 1 n ( a i , 1 ) 2 1 n i = 1 n a i , 1 2 i = 1 n a i , 1 b i 1 n i = 1 n a i , 1 i = 1 n b i i = 1 n a i , 1 a i , 2 1 n i = 1 n a i , 1 i = 1 n a i , 2 i = 1 n ( a i , 1 ) 2 1 n i = 1 n a i , 1 2 i = 1 n ( a i , 2 ) 2 1 n i = 1 n a i , 2 2 i = 1 n a i , 1 a i , 2 1 n i = 1 n a i , 1 i = 1 n a i , 2 2 ,
while the approximate solutions corresponding to the well-known problem (250) are
x ~ 1 = ( i = 1 n a i , 1 b i ) i = 1 n ( a i , 2 ) 2 ( i = 1 n a i , 2 b i ) i = 1 n a i , 1 a i , 2 i = 1 n ( a i , 1 ) 2 i = 1 n ( a i , 2 ) 2 i = 1 n a i , 1 a i , 2 2 ,
and
x ~ 2 = ( i = 1 n a i , 2 b i ) i = 1 n ( a i , 1 ) 2 ( i = 1 n a i , 1 b i ) i = 1 n a i , 1 a i , 2 i = 1 n ( a i , 1 ) 2 i = 1 n ( a i , 2 ) 2 i = 1 n a i , 1 a i , 2 2 .
Let us compare these two series of solutions for a particular numerical case. If, for instance,
A 4 × 2 = 1 2 1 1 1 1 2 2 , X 2 × 1 = x 1 x 2 and B 4 × 1 = 1 2 3 4 ,
then the solutions of the ordinary least squares problem are (x̃_1, x̃_2) = (9/7, 1), while the solutions corresponding to the minimization problem (254) are
( x_1(p = 1; I_4), x_2(p = 1; I_4) ) = (8/74, 13/74).
By substituting such values into the remaining term
V_{2,4}( x_1, x_2 | I_4 ) = Σ_{i=1}^4 ( a_{i,1} x_1 + a_{i,2} x_2 − b_i )^2 − (1/4) ( Σ_{i=1}^4 ( a_{i,1} x_1 + a_{i,2} x_2 − b_i ) )^2,
we observe that
V_{2,4}( 9/7, 1 | I_4 ) = 53983/7252 ≈ 7.4438,
whereas
V_{2,4}( 8/74, 13/74 | I_4 ) = 35378/7252 ≈ 4.8783.
On the other hand, for the well-known remaining term
E_{2,4}(x_1, x_2) = Σ_{i=1}^4 ( a_{i,1} x_1 + a_{i,2} x_2 − b_i )^2,
we observe that
E_{2,4}( 9/7, 1 ) = 185/7 ≈ 26.4285,
whereas
E_{2,4}( 8/74, 13/74 ) = 80335/2738 ≈ 29.3407.
In conclusion,
V_{2,4}( 8/74, 13/74 | I_4 ) < V_{2,4}( 9/7, 1 | I_4 ) < E_{2,4}( 9/7, 1 ),
which confirms inequality (16).
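The computations of this example are easy to replicate. The sketch below implements (253) and (254) for an illustrative system (the data are assumptions, not necessarily the paper's Example 3 entries) and confirms the same ordering:

```python
# Numpy sketch of (253)/(254): least squares vs. least 1-variance solutions.
import numpy as np

A = np.array([[1., 2.], [-1., 1.], [1., -1.], [2., 2.]])   # illustrative data
b = np.array([1., 2., 3., 4.])
z = np.ones(len(b)); p = 1.0                                # Z = I_n and p = 1

x_ls = np.linalg.solve(A.T @ A, A.T @ b)                    # minimizes E = ||Ax - b||^2
K = A.T @ A - p*np.outer(A.T @ z, z @ A)/(z @ z)            # matrix of system (252)
x_pv = np.linalg.solve(K, A.T @ b - p*(A.T @ z)*(z @ b)/(z @ z))

E = lambda v: np.sum((A @ v - b)**2)
V = lambda v: E(v) - p*(z @ (A @ v - b))**2/(z @ z)         # the functional (251)
assert V(x_pv) <= V(x_ls) <= E(x_ls)                        # same ordering as above
```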

14. An Improvement of the Bessel Inequality and Parseval Identity

In general, two cases can be considered for this purpose.

14.1. First Type of Improvement

Let { Φ k ( x ) } k = 0 be a sequence of continuous functions which are p-uncorrelated with respect to the fixed function z ( x ) and the probability density function w ( x ) / a b w ( x ) d x on [ a , b ] as before. Then, according to (44),
f ( x ) k = 0 cov p Φ k ( x ) , f ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) Φ k ( x ) ,
denotes a p-uncorrelated expansion for f ( x ) in which
cov p Φ k ( x ) , f ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) = a b w ( x ) z 2 ( x ) d x a b w ( x ) Φ k ( x ) f ( x ) d x p a b w ( x ) Φ k ( x ) z ( x ) d x a b w ( x ) f ( x ) z ( x ) d x a b w ( x ) z 2 ( x ) d x a b w ( x ) Φ k 2 ( x ) d x p a b w ( x ) Φ k ( x ) z ( x ) d x 2 .
Referring to Corollary 2 and relation (59), the following inequality holds true for the expansion (255):
0 ≤ Σ_{k=0}^∞ cov_p^2( Φ_k(x), f(x); z(x) ) / var_p( Φ_k(x); z(x) ) ≤ var_p( f(x); z(x) ).
Also, according to the definition of convergence in p-variance, inequality (256) will be transformed to an equality if
lim_{n→∞} var_p( f(x) − Σ_{k=0}^n [ cov_p( Φ_k(x), f(x); z(x) ) / var_p( Φ_k(x); z(x) ) ] Φ_k(x) ; z(x) ) = 0,
which results in
k = 0 cov p 2 Φ k ( x ) , f ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) = var p f ( x ) ; z ( x ) .
If (257) or, equivalently, (258) is satisfied, the p-uncorrelated sequence {Φ_k(x)}_{k=0}^∞ is said to be “complete” with respect to the fixed function z(x), and the symbol “∼” in (255) changes to equality. Noting the above comments, now let f, g be two expandable functions of type (255), and let {Φ_k(x)}_{k=0}^∞ be a “complete” p-uncorrelated sequence. Since
f ( x ) = k = 0 cov p Φ k ( x ) , f ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) Φ k ( x ) ,
and
g ( x ) = k = 0 cov p Φ k ( x ) , g ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) Φ k ( x ) ,
thanks to the general identity
cov p k = 0 n a k Φ k ( x ) , j = 0 m b j Φ j ( x ) ; z ( x ) = k = 0 n j = 0 m a k b j cov p Φ k ( x ) , Φ j ( x ) ; z ( x ) ,
and the fact that cov p Φ k ( x ) , Φ j ( x ) ; z ( x ) = var p Φ k ( x ) ; z ( x ) δ k , j , we obtain
cov p f ( x ) , g ( x ) ; z ( x ) = cov p k = 0 cov p Φ k ( x ) , f ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) Φ k ( x ) , k = 0 cov p Φ k ( x ) , g ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) Φ k ( x ) ; z ( x ) = k = 0 cov p Φ k ( x ) , f ( x ) ; z ( x ) cov p Φ k ( x ) , g ( x ) ; z ( x ) var p Φ k ( x ) ; z ( x ) .
which is an extension of the identity (258) for f ( x ) = g ( x ) . Also, for p = 0 , this important identity leads to the generalized Parseval identity [16]
E f ( x ) g ( x ) = k = 0 E f ( x ) Φ k ( x ) E g ( x ) Φ k ( x ) E Φ k 2 ( x ) .
The finite type of (259) occurs when f, g and {Φ_k(x)}_{k=0}^∞ are all polynomial functions. For example, let Φ_k(x) = P_k(x; a, 0, c, 0) denote the polynomials (224), satisfying
0 1 x a P n ( x ; a , 0 , c , 0 ) P m ( x ; a , 0 , c , 0 ) d x ( 2 c a + 1 ) 0 1 x c P n ( x ; a , 0 , c , 0 ) d x 0 1 x c P m ( x ; a , 0 , c , 0 ) d x = 1 2 n + a + 1 ( a c ) n ! ( c + 1 ) ( a + 1 ) n 2 δ m , n 2 c a + 1 > 0 , a c and a , c > 1 .
Also let Q m ( x ) = k = 0 m q k x k and R m ( x ) = k = 0 m r k x k , be two arbitrary polynomials of the same degree. Since
Q m ( x ) = k = 0 m ( 2 k + a + 1 ) ( c + 1 ) ( a + 1 ) k ( a c ) k ! 2 × cov 1 P k ( x ; a , 0 , c , 0 ) , Q m ( x ) ; x c a P k ( x ; a , 0 , c , 0 ) ,
and
R m ( x ) = k = 0 m ( 2 k + a + 1 ) ( c + 1 ) ( a + 1 ) k ( a c ) k ! 2 × cov 1 P k ( x ; a , 0 , c , 0 ) , R m ( x ) ; x c a P k ( x ; a , 0 , c , 0 ) ,
according to (259), we have
cov 1 Q m ( x ) , R m ( x ) ; x c a = k = 0 m ( 2 k + a + 1 ) ( c + 1 ) ( a + 1 ) k ( a c ) k ! 2 × cov 1 P k ( x ; a , 0 , c , 0 ) , Q m ( x ) ; x c a cov 1 P k ( x ; a , 0 , c , 0 ) , R m ( x ) ; x c a .

14.2. Second Type of Improvement

As inequality (57) is valid for any arbitrary selection of the coefficients { α k } k = 0 n , i.e.,
0 ≤ var_p( Y − Σ_{k=0}^n α_k X_k ; Z ) ≤ E( ( Y − Σ_{k=0}^n α_k X_k )^2 ),
such inequalities can be applied to orthogonal expansions. Suppose that {Φ_k(x)}_{k=0}^∞ is a sequence of continuous functions orthogonal with respect to the weight function w(x) on [a, b]. If f(x) is a piecewise continuous function, then
f ( x ) k = 0 α k Φ k ( x ) with α k = f , Φ k w Φ k , Φ k w ,
is known as its corresponding orthogonal expansion in which f , g w = a b w ( x ) f ( x ) g ( x ) d x . The positive quantity
S_n = ∫_a^b w(x) ( Σ_{k=0}^n α_k Φ_k(x) − f(x) )^2 dx,
will eventually lead to the Bessel inequality [15]
0 ≤ S_n = ⟨f, f⟩_w − Σ_{k=0}^n ⟨f, Φ_k⟩_w^2 / ⟨Φ_k, Φ_k⟩_w.
Now, noting (261) and (262), instead of S n , we define the following positive quantity:
V_n(p; z(x)) = S_n − R_n^2(p; z(x)) = ∫_a^b w(x) ( Σ_{k=0}^n α_k Φ_k(x) − f(x) )^2 dx − ( p / ∫_a^b w(x) z^2(x) dx ) ( ∫_a^b w(x) z(x) ( Σ_{k=0}^n α_k Φ_k(x) − f(x) ) dx )^2.
It is clear that
0 ≤ V_n(p; z(x)) ≤ S_n.
Therefore,
0 V n ( p ; z ( x ) ) = f , f w k = 0 n f , Φ k w 2 Φ k , Φ k w p z , z w k = 0 n f , Φ k w z , Φ k w Φ k , Φ k w 2 + f , z w 2 2 f , z w k = 0 n f , Φ k w z , Φ k w Φ k , Φ k w ,
can be rewritten as
⟨z, z⟩_w ⟨f, f⟩_w − p ⟨f, z⟩_w^2 ≥ ⟨z, z⟩_w Σ_{k=0}^n ⟨f, Φ_k⟩_w^2 / ⟨Φ_k, Φ_k⟩_w + p ( Σ_{k=0}^n ⟨f, Φ_k⟩_w ⟨z, Φ_k⟩_w / ⟨Φ_k, Φ_k⟩_w )^2 − 2p ⟨f, z⟩_w Σ_{k=0}^n ⟨f, Φ_k⟩_w ⟨z, Φ_k⟩_w / ⟨Φ_k, Φ_k⟩_w .
Inequality (264) is an improvement of the well-known Bessel inequality for every p [ 0 , 1 ] with respect to the fixed function z ( x ) . For example, if Φ k ( x ) = sin ( k + 1 ) x for x [ 0 , π ] and w ( x ) = z ( x ) = 1 are replaced in (264), the Bessel inequality of the Fourier sine expansion will be improved as follows
∫_0^π f^2(x) dx − (p/π) ( ∫_0^π f(x) dx )^2 ≥ (2/π) Σ_{k=0}^n ( ∫_0^π f(x) sin((k+1)x) dx )^2 + (4p/π^3) ( Σ_{k=0}^n ( (1 + (−1)^k)/(k + 1) ) ∫_0^π f(x) sin((k+1)x) dx )^2 − (4p/π^2) ( ∫_0^π f(x) dx ) Σ_{k=0}^n ( (1 + (−1)^k)/(k + 1) ) ∫_0^π f(x) sin((k+1)x) dx .
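The sharpening is easy to observe numerically. The sketch below (with the sample choice f(x) = x(π − x), an illustrative assumption) evaluates S_n and V_n(1; 1) for the Fourier sine system and confirms 0 ≤ V_n ≤ S_n:

```python
# Numeric sketch of the improved Bessel inequality for the Fourier sine system.
import numpy as np
from scipy.integrate import quad

f = lambda x: x*(np.pi - x)                # sample function
n, p = 5, 1.0

ip = lambda g, h: quad(lambda x: g(x)*h(x), 0, np.pi)[0]
phi = [lambda x, k=k: np.sin((k + 1)*x) for k in range(n + 1)]

S = ip(f, f) - sum(ip(f, ph)**2/ip(ph, ph) for ph in phi)            # Bessel remainder
R = ip(f, lambda x: 1.0) - sum(ip(f, ph)*ip(ph, lambda x: 1.0)/ip(ph, ph) for ph in phi)
V = S - p*R**2/np.pi                       # V_n(p; z) with w = z = 1 and <z, z> = pi
assert 0 <= V <= S
```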
Obviously, (264) remains an inequality if the sequence {Φ_k(x)}_{k=0}^∞ does not form a complete orthogonal system. On the other hand, if the functions form a complete orthogonal set, inequality (264) becomes an equality as n → ∞. In other words, suppose {Φ_k(x)}_{k=0}^∞ is a complete orthogonal set. Since
lim_{n→∞} S_n = lim_{n→∞} ∫_a^b w(x) ( Σ_{k=0}^n α_k Φ_k(x) − f(x) )^2 dx = 0,
we directly conclude from (263) that
0 ≤ lim_{n→∞} V_n(p; z(x)) ≤ lim_{n→∞} S_n = 0,
and, therefore,
lim n S n R n 2 ( p ; z ( x ) ) = lim n a b w ( x ) z ( x ) k = 0 n α k Φ k ( x ) f ( x ) d x 2 = 0 ,
eventually yields
Σ_{k=0}^∞ ⟨f, Φ_k⟩_w ⟨z, Φ_k⟩_w / ⟨Φ_k, Φ_k⟩_w = ⟨f, z⟩_w,
which is known in the literature as the inner product form of the generalized Parseval identity (260). In the next section, we will refer to the above-mentioned results in order to extend the presented theory in terms of a set of fixed mutually orthogonal variables.

15. Least p-Variances with Respect to Fixed Orthogonal Variables

Since parts of this section are somewhat similar to the previous ones, we just state basic concepts. Suppose x , y and { z k } k = 1 m are elements of an inner product space S , such that { z k } k = 1 m are mutually orthogonal as
⟨z_i, z_j⟩ = ⟨z_j, z_j⟩ δ_{i,j}.
Due to the orthogonality property (265), the following identity holds true
⟨ x − (1 − √(1 − p)) Σ_{k=1}^m ( ⟨x, z_k⟩ / ⟨z_k, z_k⟩ ) z_k , y − (1 − √(1 − p)) Σ_{k=1}^m ( ⟨y, z_k⟩ / ⟨z_k, z_k⟩ ) z_k ⟩ = ⟨x, y⟩ − p Σ_{k=1}^m ⟨x, z_k⟩ ⟨y, z_k⟩ / ⟨z_k, z_k⟩,
and, for y = x , it gives
⟨x, x⟩ − p Σ_{k=1}^m ⟨x, z_k⟩^2 / ⟨z_k, z_k⟩ ≥ 0 ( p ∈ [0, 1] ).
The identity (266) and inequality (267) can again be employed in mathematical statistics concepts.
Definition 6.
Let X , Y and { Z k } k = 1 m be arbitrary random variables, such that { Z k } k = 1 m are mutually orthogonal, i.e.,
E ( Z i Z j ) = E ( Z j 2 ) δ i , j , i , j = 1 , 2 , , m .
Corresponding to (266), we define
cov_p( X, Y; {Z_k}_{k=1}^m ) = E[ ( X − (1 − √(1 − p)) Σ_{k=1}^m ( E(XZ_k)/E(Z_k^2) ) Z_k ) ( Y − (1 − √(1 − p)) Σ_{k=1}^m ( E(YZ_k)/E(Z_k^2) ) Z_k ) ] = E(XY) − p Σ_{k=1}^m E(XZ_k) E(YZ_k) / E(Z_k^2),
and call it the “p-covariance of X and Y with respect to the fixed orthogonal variables { Z k } k = 1 m ”.
For Y = X , (269) changes to
var_p( X; {Z_k}_{k=1}^m ) = E[ ( X − (1 − √(1 − p)) Σ_{k=1}^m ( E(XZ_k)/E(Z_k^2) ) Z_k )^2 ] = E(X^2) − p Σ_{k=1}^m E^2(XZ_k) / E(Z_k^2) ≥ 0,
where { E ( Z k 2 ) } k = 1 m are all positive. Note in (269) that
Σ_{k=1}^m ( E(XZ_k)/E(Z_k^2) ) Z_k = Σ_{k=1}^m proj_{Z_k} X,
and, therefore, e.g., for p = 1 ,
cov_1( X, Y; {Z_k}_{k=1}^m ) = E( ( X − Σ_{k=1}^m proj_{Z_k} X ) ( Y − Σ_{k=1}^m proj_{Z_k} Y ) ).
Moreover, for orthogonal variables { Z k } k = 1 m and p [ 0 , 1 ] , we have
0 ≤ var_1( X; {Z_k}_{k=1}^m ) ≤ var_p( X; {Z_k}_{k=1}^m ) ≤ var_0( X; {Z_k}_{k=1}^m ) = E(X^2).
A remarkable point in (271) is that, if m , n are two natural numbers, such that n > m , then
0 ≤ var_p( X; {Z_k}_{k=1}^n ) ≤ var_p( X; {Z_k}_{k=1}^m ),
which can be proved directly via (270). For instance, if n = 2 and m = 1 , then
0 ≤ var_1( X; Z_1, Z_2 ) ≤ var_p( X; Z_1, Z_2 ) ≤ var_p( X; Z_1 ) ≤ var_0( X; Z_1 ) = E(X^2),
in which E ( Z 1 Z 2 ) = 0 .
The following properties hold true for definitions (269) and (270), provided that the orthogonal condition (268) is satisfied:
( b 1 ) cov p ( X , Y ; { Z k } k = 1 m ) = cov p ( Y , X ; { Z k } k = 1 m ) . ( b 2 ) cov p ( α X , β Y ; { Z k } k = 1 m ) = α β cov p ( X , Y ; { Z k } k = 1 m ) ( α , β R ) . ( b 3 ) cov p ( X + α , Y + β ; { Z k } k = 1 m ) = cov p ( X , Y ; { Z k } k = 1 m ) + α cov p ( 1 , Y ; { Z k } k = 1 m ) + β cov p ( X , 1 ; { Z k } k = 1 m ) + α β cov p ( 1 , 1 ; { Z k } k = 1 m ) . ( b 4 ) cov p k = 0 n c k X k , X j ; { Z k } k = 1 m = k = 0 n c k cov p ( X k , X j ; { Z k } k = 1 m ) ( { c k } k = 0 n R ) . ( b 5 ) var p ( α X + β Y ; { Z k } k = 1 m ) = α 2 var p ( X ; { Z k } k = 1 m ) + β 2 var p ( Y ; { Z k } k = 1 m ) + 2 α β cov p ( X , Y ; { Z k } k = 1 m ) .
Definition 7.
Based on definitions (269) and (270) and the orthogonality condition (268), we define
ρ_p( X, Y; {Z_k}_{k=1}^m ) = cov_p( X, Y; {Z_k}_{k=1}^m ) / ( √( var_p( X; {Z_k}_{k=1}^m ) ) √( var_p( Y; {Z_k}_{k=1}^m ) ) ),
and call it “p-correlation coefficient of X and Y with respect to the fixed orthogonal variables { Z k } k = 1 m ”. Clearly,
ρ_p( X, Y; {Z_k}_{k=1}^m ) ∈ [−1, 1],
because, if
U = X ( 1 1 p ) k = 1 m E ( X Z k ) E ( Z k 2 ) Z k and V = Y ( 1 1 p ) k = 1 m E ( Y Z k ) E ( Z k 2 ) Z k ,
are replaced in the Cauchy–Schwarz inequality (11), then
cov_p^2( X, Y; {Z_k}_{k=1}^m ) ≤ var_p( X; {Z_k}_{k=1}^m ) var_p( Y; {Z_k}_{k=1}^m ).
Definition 8.
If ρ p ( X , Y ; { Z k } k = 1 m ) = 0 in (272), we say that X and Y are p-uncorrelated with respect to { Z k } k = 1 m , and we have
E ( X Y ) = p k = 1 m E ( X Z k ) E ( Y Z k ) E ( Z k 2 ) .

15.1. Least p-Variances Approximation Based on Fixed Orthogonal Variables

Again, consider the approximation (13), Y ≅ Σ_{k=0}^n c_k X_k, and define the p-variance of the remaining term R(c_0, c_1, …, c_n) = Σ_{k=0}^n c_k X_k − Y, with respect to the orthogonal variables {Z_k}_{k=1}^m, as follows:
var p R ( c 0 , , c n ) ; { Z k } k = 1 m = E R ( c 0 , , c n ) ( 1 1 p ) k = 1 m E ( Z k R ( c 0 , , c n ) ) E ( Z k 2 ) Z k 2 .
To minimize (273), the relations
∂ var_p( R(c_0, …, c_n); {Z_k}_{k=1}^m ) / ∂c_j = 0 for j = 0, 1, …, n,
eventually lead to the following linear system:
var p ( X 0 ; { Z k } k = 1 m ) cov p ( X 1 , X 0 ; { Z k } k = 1 m ) cov p ( X n , X 0 ; { Z k } k = 1 m ) cov p ( X 0 , X 1 ; { Z k } k = 1 m ) var p ( X 1 ; { Z k } k = 1 m ) cov p ( X n , X 1 ; { Z k } k = 1 m ) cov p ( X 0 , X n ; { Z k } k = 1 m ) cov p ( X 1 , X n ; { Z k } k = 1 m ) var p ( X n ; { Z k } k = 1 m ) × c 0 c 1 c n = cov p ( X 0 , Y ; { Z k } k = 1 m ) cov p ( X 1 , Y ; { Z k } k = 1 m ) cov p ( X n , Y ; { Z k } k = 1 m ) .
Both continuous and discrete spaces can be considered for the system (274).

15.1.1. First Case

If X k = Φ k ( x ) k = 0 n , Y = f ( x ) and the orthogonal set Z k = z k ( x ) k = 1 m are defined in a continuous space with a probability density function as P r X = x = w ( x ) a b w ( x ) d x , the elements of the system (274) appear as
cov p Φ i ( x ) , Φ j ( x ) ; z k ( x ) k = 1 m = 1 a b w ( x ) d x a b w ( x ) Φ i ( x ) Φ j ( x ) d x p k = 1 m a b w ( x ) Φ i ( x ) z k ( x ) d x a b w ( x ) Φ j ( x ) z k ( x ) d x a b w ( x ) z k 2 ( x ) d x ,
if and only if
a b w ( x ) z i ( x ) z j ( x ) d x = a b w ( x ) z j 2 ( x ) d x δ i , j .

15.1.2. Second Case

If the above-mentioned variables are defined on a countable set, say A* = {x_k}_{k=0}^m, with a discrete probability density function Pr(X = x) = j(x)/Σ_{x∈A*} j(x), then
cov p Φ i ( x ) , Φ j ( x ) ; { z k ( x ) } k = 1 m = 1 x A * j ( x ) x A * j ( x ) Φ i ( x ) Φ j ( x ) p k = 1 m x A * j ( x ) Φ i ( x ) z k ( x ) x A * j ( x ) Φ j ( x ) z k ( x ) x A * j ( x ) z k 2 ( x ) ,
if and only if
x A * j ( x ) z i ( x ) z j ( x ) = x A * j ( x ) z i 2 ( x ) δ i , j .
Let us consider a particular example. Suppose that m = 2 , x [ 0 , π ] and P r X = x = 1 π . In this case, it is well known that, if we take z 1 ( x ) = sin x and z 2 ( x ) = cos x , then
E z 1 ( x ) z 2 ( x ) = 1 π 0 π sin x cos x d x = 0 ,
and, therefore,
cov_p( Φ_i(x), Φ_j(x); sin x, cos x ) = (1/π) ∫_0^π Φ_i(x) Φ_j(x) dx − (2p/π^2) [ ∫_0^π Φ_i(x) sin x dx ∫_0^π Φ_j(x) sin x dx + ∫_0^π Φ_i(x) cos x dx ∫_0^π Φ_j(x) cos x dx ].
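A small numerical sketch of this p-covariance (the functions Φ_i(x) = x, Φ_j(x) = x^2 and the value p = 1/2 are illustrative assumptions):

```python
# Numeric sketch of the p-covariance with two fixed orthogonal functions on [0, pi].
import numpy as np
from scipy.integrate import quad

E = lambda g: quad(g, 0, np.pi)[0]/np.pi                    # Pr(X = x) = 1/pi
z = [np.sin, np.cos]                                        # mutually orthogonal on [0, pi]
phi_i, phi_j = (lambda x: x), (lambda x: x**2)
p = 0.5

covp = E(lambda x: phi_i(x)*phi_j(x)) - p*sum(
    E(lambda x: phi_i(x)*zk(x))*E(lambda x: phi_j(x)*zk(x))/E(lambda x: zk(x)**2)
    for zk in z)
varp = E(lambda x: phi_i(x)**2) - p*sum(
    E(lambda x: phi_i(x)*zk(x))**2/E(lambda x: zk(x)**2) for zk in z)
assert varp >= 0                                            # p-variances are nonnegative
print(covp, varp)
```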
The weighted version of the above example can be considered in various cases. For example, if w(x) = e^{−x}, z_1(x) = e^x sin x and z_2(x) = cos x, all defined on [0, π], then
E( z_1(x) z_2(x) ) = ( 1/(1 − e^{−π}) ) ∫_0^π sin x cos x dx = 0,
and consequently the corresponding space is defined as
(1 − e^{−π}) cov_p( Φ_i(x), Φ_j(x); sin x, cos x ) = ∫_0^π e^{−x} Φ_i(x) Φ_j(x) dx − p [ ( ∫_0^π Φ_i(x) sin x dx ∫_0^π Φ_j(x) sin x dx ) / ∫_0^π e^x sin^2 x dx + ( ∫_0^π Φ_i(x) e^{−x} cos x dx ∫_0^π Φ_j(x) e^{−x} cos x dx ) / ∫_0^π e^{−x} cos^2 x dx ],
where
∫_0^π e^x sin^2 x dx = (2/5)(e^π − 1) and ∫_0^π e^{−x} cos^2 x dx = (3/5)(1 − e^{−π}).
In the sequel, applying the uncorrelatedness condition
cov p ( X i , X j ; { Z k } k = 1 m ) = var p ( X j ; { Z k } k = 1 m ) δ i , j for every i , j = 0 , 1 , , n ,
on the elements of the linear system (274), we can obtain the unknown coefficients as
c k = cov p ( X k , Y ; { Z k } k = 1 m ) var p ( X k ; { Z k } k = 1 m ) .
In this case,
Y k = 0 n cov p ( X k , Y ; { Z k } k = 1 m ) var p ( X k ; { Z k } k = 1 m ) X k ,
is the best approximation in the sense of least p-variance of the error with respect to the fixed orthogonal variables { Z k } k = 1 m .
Finally, we point out that p-uncorrelated vectors with respect to fixed orthogonal vectors can be constructed as follows. Let A m = ( a 1 , a 2 , , a m ) and B m = ( b 1 , b 2 , , b m ) be two arbitrary vectors, and let Z k , m = ( z k , 1 , z k , 2 , , z k , m ) k = 1 l be a set of fixed orthogonal vectors. Then,
A m ( 1 1 p ) k = 1 l A m . Z k , m Z k , m . Z k , m Z k , m . B m ( 1 1 p ) k = 1 l B m . Z k , m Z k , m . Z k , m Z k , m = A m . B m p k = 1 l ( A m . Z k , m ) ( B m . Z k , m ) Z k , m . Z k , m ,
if and only if Z_{k,m} · Z_{j,m} = Z_{j,m} · Z_{j,m} δ_{k,j}.

Funding

This work was supported by the Alexander von Humboldt Foundation under the following grant number: Ref 3.4-IRN-1128637-GF-E.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the author.

Acknowledgments

Special thanks to three anonymous referees for their valuable comments and suggestions.

Conflicts of Interest

The author declares that he has no competing interest.

References

  1. Van De Geer, S. A new approach to least-squares estimation, with applications. Ann. Stat. 1987, 15, 587–602.
  2. Kariya, T.; Kurata, H. Generalized Least Squares; John Wiley & Sons: Hoboken, NJ, USA, 2004.
  3. Park, T.; Casella, G. The Bayesian lasso. J. Am. Stat. Assoc. 2008, 103, 681–686.
  4. Wolberg, J. Data Analysis Using the Method of Least Squares: Extracting the Most Information from Experiments; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006.
  5. Lancaster, P.; Salkauskas, K. Surfaces generated by moving least squares methods. Math. Comput. 1981, 37, 141–158.
  6. Shepard, D. A two-dimensional interpolation function for irregularly-spaced data. In Proceedings of the 1968 23rd ACM National Conference, New York, NY, USA, 27–29 August 1968; pp. 517–524.
  7. Yang, H.; Wang, H.; Li, B. Analysis of meshfree Galerkin methods based on moving least squares and local maximum-entropy approximation schemes. Mathematics 2024, 12, 494.
  8. Guo, L.; Narayan, A.; Zhou, T. Constructing least-squares polynomial approximations. SIAM Rev. 2020, 62, 483–508.
  9. Su, Y.; Li, M.; Yan, C.; Zhang, T.; Tang, H.; Li, H. Quantitative analysis of biodiesel adulterants using Raman spectroscopy combined with synergy interval partial least squares (siPLS) algorithms. Appl. Sci. 2023, 13, 11306.
  10. Legendre, A.M. Nouvelles Méthodes pour la Détermination des Orbites des Comètes; Didot: Paris, France, 1805.
  11. Stigler, S.M. Gauss and the invention of least squares. Ann. Stat. 1981, 9, 465–474.
  12. Rao, C.R. Linear Models and Generalizations; Springer: Berlin/Heidelberg, Germany, 2008.
  13. Björck, Å. Numerical Methods for Least Squares Problems; SIAM: Philadelphia, PA, USA, 2024.
  14. Masjed-Jamei, M. A functional generalization of the Cauchy–Schwarz inequality and some subclasses. Appl. Math. Lett. 2009, 22, 1335–1339.
  15. Powell, M.J.D. Approximation Theory and Methods; Cambridge University Press: Cambridge, UK, 1981.
  16. Davis, P.J. Interpolation and Approximation; Courier Corporation: Chelmsford, MA, USA, 1975.
  17. Brezinski, C. Biorthogonality and Its Applications to Numerical Analysis; CRC Press: Boca Raton, FL, USA, 2020.
  18. Iserles, A.; Nørsett, S. On the theory of biorthogonal polynomials. Trans. Am. Math. Soc. 1988, 306, 455–474.
  19. Masjed-Jamei, M. A basic class of symmetric orthogonal polynomials using the extended Sturm–Liouville theorem for symmetric functions. J. Math. Anal. Appl. 2007, 325, 753–775.
  20. Masjed-Jamei, M. Special Functions and Generalized Sturm–Liouville Problems; Springer Nature: Berlin/Heidelberg, Germany, 2020.
  21. Area, I. Hypergeometric multivariate orthogonal polynomials. In Proceedings of the Orthogonal Polynomials: 2nd AIMS-Volkswagen Stiftung Workshop, Douala, Cameroon, 5–12 October 2018; Springer: Berlin/Heidelberg, Germany, 2020; pp. 165–193.
  22. Milovanović, G.V. Some orthogonal polynomials on the finite interval and Gaussian quadrature rules for fractional Riemann–Liouville integrals. Math. Methods Appl. Sci. 2021, 44, 493–516.
  23. Milovanović, G.V. On the Markov extremal problem in the L2-norm with the classical weight functions. arXiv 2021, arXiv:2111.01094.
  24. Milovanović, G.V. Orthogonality on the semicircle: Old and new results. Electron. Trans. Numer. Anal. 2023, 59, 99–115.
  25. Chihara, T.S. An Introduction to Orthogonal Polynomials; Courier Corporation: Chelmsford, MA, USA, 2011.
  26. Koepf, W. Hypergeometric identities. In Hypergeometric Summation: An Algorithmic Approach to Summation and Special Function Identities; Springer: Berlin/Heidelberg, Germany, 2014; pp. 11–33.
  27. Koepf, W. Power series in computer algebra. J. Symb. Comput. 1992, 13, 581–603.
  28. Masjed-Jamei, M. Biorthogonal exponential sequences with weight function exp(ax² + ibx) on the real line and an orthogonal sequence of trigonometric functions. Proc. Am. Math. Soc. 2008, 136, 409–417.
  29. Masjed-Jamei, M. A linear constructive approximation for integrable functions and a parametric quadrature model based on a generalization of Ostrowski–Grüss type inequalities. Electron. Trans. Numer. Anal. 2011, 38, 218–232.
  30. Masjed-Jamei, M. A certain class of weighted approximations for integrable functions and applications. Numer. Funct. Anal. Optim. 2013, 34, 1224–1244.
Table 1. Characteristics of ten sequences of classical orthogonal polynomials. Here W(d, e; a, b, c)(x) denotes the Pearson-type weight satisfying W'(x)/W(x) = (dx + e)/(ax^2 + bx + c), and W*(r, s; g, h)(x) denotes the symmetric weight satisfying W*'(x)/W*(x) = (rx^2 + s)/(x(gx^2 + h)).

Symbol | Weight Function | Kind, Interval | Parameter Constraints
P_n^{(u,v)}(x) | W(-u-v, -u+v; -1, 0, 1)(x) = (1-x)^u (1+x)^v | Infinite, [-1, 1] | u > -1, v > -1
L_n^{(u)}(x) | W(-1, u; 0, 1, 0)(x) = x^u exp(-x) | Infinite, [0, ∞) | u > -1
H_n(x) | W(-2, 0; 0, 0, 1)(x) = exp(-x^2) | Infinite, (-∞, ∞) | none
J_n^{(u,v)}(x) | W(-2u, v; 1, 0, 1)(x) = (1+x^2)^{-u} exp(v arctan x) | Finite, (-∞, ∞) | max n < (u-1)/2
M_n^{(u,v)}(x) | W(-u, v; 1, 1, 0)(x) = x^v (x+1)^{-(u+v)} | Finite, [0, ∞) | max n < (u-1)/2, v > -1
N_n^{(u)}(x) | W(-u, 1; 1, 0, 0)(x) = x^{-u} exp(-1/x) | Finite, [0, ∞) | max n < (u-1)/2
S_n(-2u-2v-2, 2u; -1, 1)(x) | W*(-2u-2v, 2u; -1, 1)(x) = x^{2u} (1-x^2)^v | Infinite, [-1, 1] | u > -1/2, v > -1
S_n(-2, 2u; 0, 1)(x) | W*(-2, 2u; 0, 1)(x) = x^{2u} exp(-x^2) | Infinite, (-∞, ∞) | u > -1/2
S_n(-2u-2v+2, -2u; 1, 1)(x) | W*(-2u-2v, -2u; 1, 1)(x) = x^{-2u} (1+x^2)^{-v} | Finite, (-∞, ∞) | n < u+v-1/2, u < 1/2, v > 0
S_n(-2u+2, 2; 1, 0)(x) | W*(-2u, 2; 1, 0)(x) = x^{-2u} exp(-1/x^2) | Finite, (-∞, ∞) | max n < u-1/2