1. Introduction
Statistical projection estimates in the Monte Carlo method were first proposed by N.N. Chentsov [1]. He developed a general technique for optimizing such estimates; this technique, however, requires clarification in specific problems [2]. In [3], projection estimates based on Legendre polynomials were proposed for the marginal probability densities of solutions to stochastic differential equations. The mean square error of these estimates was studied, and the obtained projection estimates were compared with the histogram using examples. The analysis of the results showed that, for the same sample size, the projection estimate approximates the density more accurately. In addition, such an estimate is specified analytically and is a smooth function, so it is preferable in, e.g., filtering problems and the control of nonlinear systems [4,5,6,7].
This paper presents two statistical algorithms, based on the algorithms with Legendre polynomials from [3], for jointly obtaining projection estimates of the density and distribution function of a random variable.
When solving various problems by statistical estimation algorithms, it is important to make an optimal (consistent) choice of their parameters for finding the mathematical expectation of a certain functional that depends on a random variable. Therefore, this paper analyzes the mean square error of the projection estimate from this point of view. Such parameters are the projection expansion length and the sample size. We solve a conditional optimization problem considered by G.A. Mikhailov [8]. The objective of this problem is to minimize the algorithm complexity while achieving the required level of approximation accuracy. We study how to minimize the mean square error of the projection estimates of the density and distribution function by equalizing its deterministic and stochastic components. The accuracy of the projection estimates also depends on the degree of smoothness of the density; therefore, we consider the dependence of the error not only on the projection expansion length but also on the degree of smoothness of the approximated function.
The obtained theoretical results are confirmed with examples using a two-parameter family of densities that allows one both to choose the degree of smoothness and to perform a simple calculation of the expansion coefficients with respect to Legendre polynomials.
The rest of this paper has the following structure. Section 2 contains the necessary information on the Fourier–Legendre series and the definition of projection estimates of the density and distribution function; in this section, relations for the expansion coefficients of the density and distribution function with respect to Legendre polynomials are obtained. Section 3 presents algorithms for jointly obtaining randomized projection estimates of the density and distribution function. The analysis and conditional optimization of randomized projection estimates are carried out in Section 4. Section 5 proposes a two-parameter family of densities with different degrees of smoothness and related distribution functions; it presents an algorithm for modeling the corresponding random variables, gives the expansion coefficients of the considered densities and distribution functions, and studies the convergence rate of their expansions. Numerical experiments and their analysis are discussed in Section 6. In Section 7, a projection estimate of the density is compared with a histogram. Brief conclusions are given in Section 8.
3. Algorithms for Jointly Obtaining Randomized Projection Estimates of Density and Distribution Functions
The randomization of the projection estimate of the density g is obtained from the first equation of Formula (14) by calculating the linear functional (15) for the Legendre polynomials (6) by the Monte Carlo method with independent realizations of the random variable ξ:
where ξ_l is the lth realization and N is the sample size (number of realizations).
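As an illustration, the following minimal sketch shows this Monte Carlo estimator of the expansion coefficients, assuming Legendre polynomials orthonormalized on [−1, 1] and a NumPy array of realizations; the names and the interval are assumptions made here for illustration, not the paper's notation.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def coefficient_estimates(sample, n):
    """Estimate the first n + 1 expansion coefficients of the density:
    the ith estimate is the sample mean of the ith orthonormal Legendre
    polynomial evaluated at the realizations."""
    sample = np.asarray(sample)
    c_hat = np.empty(n + 1)
    for i in range(n + 1):
        e_i = np.zeros(i + 1)
        e_i[i] = 1.0                                  # selects the polynomial of degree i
        norm = np.sqrt((2 * i + 1) / 2.0)             # orthonormalization on [-1, 1]
        c_hat[i] = norm * leg.legval(sample, e_i).mean()
    return c_hat
```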
The estimates of the expansion coefficients can also be obtained from expression (16) as
where statistical estimates of the initial moments of the random variable ξ are used. Using Formulae (4) and (5), we can write the expressions for the estimates of the density expansion coefficients separately for even indices as
and for odd indices as
The randomization of the projection estimate of the distribution function f is obtained from the second equation of Formula (14) and recurrence relations (20):
Here and below, the dependence of the estimates of both expansion coefficients and functions on the sample size N is not indicated for simplicity.
Remark 3. To obtain a projection estimate of the distribution function f based on the first Legendre polynomials, it is necessary to find estimates of the expansion coefficients of the density g with respect to the first Legendre polynomials.
Further, we present Algorithm 1 for jointly obtaining randomized projection estimates of the density and distribution function of the random variable ξ.
Algorithm 1: Jointly obtaining projection estimates of the density and distribution function using explicit formulae for the Legendre polynomials.
0. Set the projection expansion length n and the sample size N.
1. Simulate N realizations of the random variable ξ.
2. Find statistical estimates of the initial moments of the random variable ξ using Formula (23).
3. Find estimates of the expansion coefficients of the density g using Formula (22) or Formulae (24) and (25).
4. Find estimates of the expansion coefficients of the distribution function f using Formula (26).
5. Find the randomized projection estimates of the density and distribution function using Formula (14).
In step 3 of Algorithm 1, errors can occur due to the peculiarities of machine arithmetic when calculating the expansion coefficient estimates. To avoid this, it is recommended to use Formula (21) together with recurrence relation (1) and expression (6).
Next, we formulate Algorithm 2 for jointly obtaining randomized projection estimates of the density and distribution function.
Algorithm 2: Jointly obtaining projection estimates of the density and distribution function using the recurrence relation for the Legendre polynomials.
0. Set the projection expansion length n and the sample size N.
1. Simulate N realizations of the random variable ξ.
2. Find estimates of the expansion coefficients of the density g using Formula (21) together with recurrence relation (1) and expression (6).
3. Find estimates of the expansion coefficients of the distribution function f using Formula (26).
4. Find the randomized projection estimates of the density and distribution function using Formula (14).
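To make the workflow concrete, here is a sketch that follows the structure of Algorithm 2 under simplifying assumptions: the polynomials are orthonormalized on [−1, 1], and the distribution function estimate is obtained by numerically integrating the density estimate rather than by the recurrence relations (20) and (26), which are not reproduced here; the test sample is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def joint_projection_estimates(sample, n):
    """Sketch of Algorithm 2: estimate the density and distribution function
    of a random variable supported on [-1, 1] from a sample, using a
    projection expansion of length n in orthonormal Legendre polynomials."""
    sample = np.asarray(sample)
    norms = np.sqrt((2 * np.arange(n + 1) + 1) / 2.0)    # orthonormalization factors
    basis = np.eye(n + 1)                                # rows select individual polynomials
    # Estimates of the expansion coefficients of the density g
    c_hat = np.array([norms[i] * leg.legval(sample, basis[i]).mean()
                      for i in range(n + 1)])
    # Projection estimate of the density
    def g_hat(x):
        return sum(c_hat[i] * norms[i] * leg.legval(x, basis[i])
                   for i in range(n + 1))
    # Distribution function estimate: antiderivative of g_hat on a grid
    # (the paper uses recurrence relations for the coefficients of f instead)
    grid = np.linspace(-1.0, 1.0, 2001)
    vals = g_hat(grid)
    cdf_grid = np.concatenate(([0.0],
                               np.cumsum((vals[1:] + vals[:-1]) / 2 * np.diff(grid))))
    def f_hat(x):
        return np.interp(x, grid, cdf_grid)
    return g_hat, f_hat

# Usage with a hypothetical sample on [-1, 1]
rng = np.random.default_rng(0)
xi = 2.0 * rng.beta(2.0, 3.0, size=10_000) - 1.0
g_hat, f_hat = joint_projection_estimates(xi, n=16)
```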
4. Analysis and Conditional Optimization of Randomized Projection Estimates
In this section, we analyze the error of the projection estimate relative to the density g of the random variable ξ in the L2 norm. By Jensen's inequality [21], it suffices to study the mean square error. Further, we consider the following expression:
The two functions in this expression are orthogonal in L2 since they belong to different subspaces: the first one is formed by the Legendre polynomials of degrees not exceeding n, and the second one is formed by the Legendre polynomials of higher degrees. This is a consequence of the equalities
hence,
According to Parseval's identity [9], we have
and
Therefore, taking into account that the unbiased estimate of the mathematical expectation is used [18], we obtain
where the variance of the coefficient estimates appears.
The equality
is satisfied by the construction of the estimates. For the other coefficients, the variance of the estimates and the sample size N are inversely proportional [22]; therefore, additionally using inequality (9) under the condition from Remark 1, we find
where the constants are independent of n and N.
The error of the projection estimate relative to the distribution function f of the random variable ξ in the L2 norm is analyzed similarly:
where the constants are independent of n and N.
From the obtained estimates, it is clear how the mean square error depends on the projection expansion length n and the sample size N.
Further, we consider the problem of the optimal (consistent) choice of parameters of the statistical algorithms for obtaining projection estimates of the density and distribution function: the projection expansion length n and the sample size N. For jointly obtaining these estimates by Algorithms 1 and 2, it is sufficient to consider the optimal choice of parameters for the density estimation only and to use them for the distribution function estimation. This is because the degree of smoothness of f is greater than that of g, so the projection expansion converges to f faster than it converges to g.
The main result is stated using the symbol “≍”. For suitable functions u and v of n and N, the expression u ≍ v means that u = O(v) and v = O(u); i.e., there exist positive constants c1 and c2 such that c1 v ≤ u ≤ c2 v, where n and N are natural numbers.
Theorem 1. Let the density g of the random variable ξ belong to the smoothness space considered above, and let the randomized projection estimate of the density be obtained by Algorithm 1 or 2 with the projection expansion length n and the sample size N. Then, the minimum complexity of obtaining the estimate is achieved with the parameters n and N that satisfy the relations
where γ is the required approximation accuracy for the density g in the L2 norm.
Proof. To find the optimal parameters n and N for the estimate, it is sufficient to equate the terms [8] in Formula (28) and express the required approximation accuracy γ from the relation
From the equality
we obtain the relationship for the optimal parameters, as well as the expressions for n and N in terms of γ. □
Theorem 1 establishes the relationship between the parameters n and N in Algorithms 1 and 2, as well as the dependence of the approximation accuracy on the parameter s. By choosing the parameters in this way, we have
where the constant is independent of n and N.
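As a numerical illustration of this conditional optimization (and of the equalization of the error components used in the proof), the following sketch assumes a model squared-error bound of the form C1·n^(−2s) + C2·n/N; the constants and the exact form are illustrative assumptions, not the paper's Formula (28).

```python
import math

def conditionally_optimal_parameters(gamma, s, c1=1.0, c2=1.0):
    """Equalize the deterministic term c1 * n**(-2 * s) and the stochastic
    term c2 * n / N of an assumed squared-error bound, so that each one
    equals gamma**2 / 2 for the prescribed accuracy gamma."""
    n = math.ceil((2.0 * c1 / gamma ** 2) ** (1.0 / (2.0 * s)))
    N = math.ceil(2.0 * c2 * n / gamma ** 2)
    return n, N

# Example: accuracy 0.01 for smoothness s = 2
print(conditionally_optimal_parameters(gamma=0.01, s=2.0))   # (12, 240000)
```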
Remark 4. The error (28) of the randomized projection estimate relative to the density g is based on inequality (9), but inequality (10) can also be used. Then, taking into account Remark 1, we have
It is possible to formulate an analogue of Theorem 1 and show that the following relationship between the parameters for estimating the density g is conditionally optimal:
Remark 5. If the distribution function f of the random variable ξ has the corresponding degree of smoothness (this condition holds if the conditions of Theorem 1 are satisfied for the density g), then the minimum complexity of obtaining the randomized projection estimate by Algorithm 1 or 2 is achieved with parameters n and N that satisfy the relations
where γ is the required approximation accuracy for the function f in the L2 norm. The proof is similar to that of Theorem 1. Another relationship between the parameters can be found if inequality (10) is taken instead of inequality (9) (see Remark 4).

5. Two-Parameter Family of Densities with Different Degrees of Smoothness
In this section, a special example is proposed to test statistical algorithms for obtaining projection estimates depending on the projection expansion length, sample size, and smoothness of the estimated function.
5.1. Densities with Different Degrees of Smoothness and Related Distribution Functions
Let ξ be the random variable defined by the density of the form
where the two parameters are natural numbers and the normalizing constant is given by
because
The function has the following properties:
(a) The density is continuous on the set of real numbers;
(b) Its support is the set Ω;
(c) The normalization condition holds:
(d) The density is differentiable only r times at the corresponding points:
(e) The derivative of order r + 1 exists almost everywhere on Ω; if this derivative is understood in a generalized sense, then
Next, we determine the distribution function for the random variable ξ with density (30):
It is easy to see that the distribution function equals zero to the left of the support and one to the right of it. Moreover,
and consequently,
The distribution function is differentiable one more time than the density at the corresponding points, since
due to relationship (31). Thus,
If we do not restrict ourselves to the space with natural s (see Remark 1), then sharper degrees of smoothness can be established for the density and distribution function.
Consider the first case. Then, the generalized derivative is represented as a linear combination of two functions, one of which involves an indicator function. It suffices to find a condition on the parameter which ensures the required degree of smoothness, where the corresponding quantity is defined by Formula (11).
If x and y have the same sign, then
and if x and y have different signs, then
Hence,
The integrals on the right-hand side of the latter equality coincide since the integrand does not change when the signs of x and y change simultaneously. The convergence condition for them imposes a restriction on the parameter. Indeed, under this restriction we have
and
Otherwise, these integrals obviously diverge. The second case is treated similarly, so, finally, the stated degrees of smoothness of the density and distribution function hold, provided that the parameters satisfy the corresponding conditions.
5.2. Modeling Random Variables with Given Test Distributions Using Monte Carlo Method
The modeling formula for the random variable ξ with the given distribution function and parameters can be derived using the inverse function method [22]: a realization of ξ is obtained by applying the inverse distribution function to a random variable having a uniform distribution on the unit interval.
Given the distribution function, we can formulate Algorithm 3 for modeling the random variable ξ.
Algorithm 3: Modeling the random variable with given test density and distribution function.
0. Set the parameters and calculate the required constants:
1. Obtain a realization of the random variable having a uniform distribution on the unit interval.
2. If this realization does not exceed the threshold value, then the realization of ξ is a root of algebraic Equation (33) from the corresponding interval; otherwise, it is a root of algebraic Equation (34) from the corresponding interval.
For small parameter values, the roots can be found analytically. Next, we obtain the modeling formulae for the two cases used below.
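Since the test distribution functions are increasing piecewise polynomials, realizations can alternatively be generated by numerical inversion, without the closed-form roots of Propositions 1 and 2. A minimal sketch, assuming the distribution function F is available as a Python callable that is continuous and increasing on [a, b] (the placeholder F below is illustrative, not the paper's test function):

```python
import numpy as np
from scipy.optimize import brentq

def sample_by_inversion(F, a, b, size, rng=None):
    """Inverse function method: for each uniform value u, solve F(x) = u on
    [a, b] by bracketed root finding (F must be continuous and increasing)."""
    rng = np.random.default_rng() if rng is None else rng
    u_values = rng.uniform(0.0, 1.0, size)
    return np.array([brentq(lambda x, u=u: F(x) - u, a, b) for u in u_values])

# Usage with a placeholder distribution function on [-1, 1]
F = lambda x: (x + 1.0) ** 2 / 4.0
xi = sample_by_inversion(F, -1.0, 1.0, size=1000)
```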
Proposition 1. For the random variable ξ with the density
the modeling formula is as follows:
where
Proof. The density g of the random variable ξ is included in the two-parameter family (30) for specific parameter values. For the given parameters, we have
First, we consider the first case; i.e., we should solve Equation (33). This is the quadratic equation
with coefficients determined by the given parameters.
The corresponding quadratic function attains a negative minimum on the relevant interval. This means that the quadratic equation has two real roots (the discriminant is positive), and the largest of them determines the realization of ξ:
Second, we consider the other case; i.e., we should solve Equation (34). This is the cubic equation
with coefficients determined by the given parameters.
The corresponding cubic function has two extrema, a minimum and a maximum, whose values have opposite signs. This means that the cubic equation has three real roots (the discriminant is positive), and the realization of ξ is determined by the root that lies between the smallest and the largest roots. By using Cardano's formulae for the roots [23], we have
where the quantities under the cube roots correspond to the positive discriminant; therefore, A and B are complex conjugate numbers:
Writing A and B in trigonometric form and comparing their modulus and argument, we obtain
and
Thus,
i.e., Formula (35) is valid. □
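The step with the complex conjugate quantities A and B is the trigonometric form of Cardano's formulae for a cubic with three real roots. A generic sketch for a depressed cubic t^3 + p·t + q = 0 in this case, cross-checked against numpy.roots (the coefficients below are illustrative, not those of Equation (34)):

```python
import numpy as np

def depressed_cubic_real_roots(p, q):
    """Three real roots of t**3 + p * t + q = 0 when the cubic has three real
    roots (p < 0), via the trigonometric form of Cardano's formulae."""
    rho = 2.0 * np.sqrt(-p / 3.0)
    phi = np.arccos(3.0 * q / (p * rho))
    return np.array([rho * np.cos((phi - 2.0 * np.pi * k) / 3.0) for k in range(3)])

# Cross-check on an illustrative cubic: t**3 - 7t + 6 = 0 has roots -3, 1, 2
p, q = -7.0, 6.0
print(np.sort(depressed_cubic_real_roots(p, q)))       # [-3.  1.  2.]
print(np.sort(np.roots([1.0, 0.0, p, q]).real))        # same roots via numpy
```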
Proposition 2. For the random variable ξ with the density
the modeling formula is as follows:
where
Proof. The density g of the random variable ξ is included in the two-parameter family (30) for specific parameter values. In this case,
The proof is the same as for Proposition 1, so some details are omitted. We only note that, in the first case, Equation (33) is the quartic equation
whose polynomial attains a negative minimum on the relevant interval. The quartic equation has two real roots, and the largest of them determines the realization of ξ. It is convenient to find the roots using Ferrari's formulae [23].
In the second case, Equation (34) is a cubic equation that is solved similarly to the cubic equation from the proof of Proposition 1. Such reasoning leads to Formula (36). □
5.3. Expansion Coefficients of Test Functions (Fourier–Legendre Series)
To exactly calculate the second term of the projection estimate error (27) in the examples, we find the expansion coefficients of the test density, as well as the expansion coefficients of the test distribution function, with respect to Legendre polynomials (6):
First, we obtain the following values:
We multiply the left-hand and right-hand sides of relation (1) by the corresponding factor and integrate over the interval:
or
Similarly, by multiplying the left-hand and right-hand sides of relation (1) by another factor and integrating over the interval, we obtain
These relations can be formally applied for all but the initial indices; therefore, the initial case should be considered separately:
For the initial indices, we have
and then we use relation (17):
If the index i is even, then the corresponding value follows from Formula (5). If i is odd, then we can apply the explicit Formula (4) or obtain an additional recurrence relation. We choose the latter way and take into account relation (1):
The same reasoning leads to the following results:
and
for even i. For odd i, we have
Thus, we obtain the general expression for calculating these values with arbitrary non-negative integer indices:
so that the expansion coefficients of the test density with respect to Legendre polynomials (6) are expressed as follows (relation (37) is also used here):
Expressions for the expansion coefficients of the test distribution function are obtained similarly:
These expressions for the expansion coefficients of the density and distribution function are used for their approximation as
and for the approximation error:
5.4. Analysis of Convergence Rate for Expansions of Test Functions
Consider the function
where the parameter is a natural number. Its expansion coefficients with respect to Legendre polynomials (6) are expressed through the previously found values:
Further, we derive a recurrence relation for these coefficients, different from relation (38). We multiply the left-hand and right-hand sides of relation (1) by the considered function and integrate over the interval:
Next, we use the rule of integration by parts:
and consequently, taking into account equality (17), we obtain
therefore,
or
i.e.,
The series formed by the squared expansion coefficients converges. It can be represented as a sum of two series:
The Raabe–Duhamel test [24] implies that the first series (and similarly the second one) converges in the same way as a comparison series
since the corresponding sequence of ratios has a limit, while the convergence of this series is equivalent to the convergence of an integral
which takes place under the corresponding condition on the parameter.
As a result, using Parseval's identity, we find
where
and this corresponds to estimate (9) with the limit value of the smoothness parameter (see Remark 1).
The obtained result can be transferred to the reflected function and its expansion coefficients with respect to Legendre polynomials (6). The easiest way to prove this is to use the relation between the expansion coefficients of an arbitrary function and those of its reflection [25]. Therefore, the same result holds for the test density. This result can be extended to the test distribution function, since its degree of smoothness is greater by one than that of the density.
Thus,
where the involved quantities are some constants. Moreover, the parameters can be assumed to be real positive numbers, and if the considered function is not required to be a density, i.e., is not bound by the probability-theoretic framework, then a wider range of parameter values is admissible. Such a convergence reflects the degrees of smoothness of the test density and distribution function. It corresponds to estimate (9).
6. Numerical Experiments
In this section, we present the results of the joint estimation of the density and distribution function for two examples that use the two-parameter family (30) of densities with different degrees of smoothness. In these examples, the results are presented in tables that contain the errors of the projection estimates of the density and distribution function in the L2 norm. We study the dependence of the error on the projection expansion length (for the maximum degree n of the Legendre polynomials, the values 4, 8, 16, 32, and 64 are used), on the sample size N, and on the degree of smoothness of the approximated density (see Examples 1 and 2 below). Algorithm 2 is applied for the estimation.
These examples show the approximation errors of the density and distribution function, which are calculated using Formula (39), i.e., the deterministic components of the projection estimate errors. In the tables, they are given in the rows marked by the symbol “∗”. The remaining rows contain errors that include both deterministic and stochastic components. The formulae for these errors follow from Parseval's identity:
where the estimates of the corresponding expansion coefficients are used. For an arbitrary density g, a similar formula was used to obtain estimate (27).
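For reference, the Parseval-based error computation used in the tables can be sketched as follows, assuming the exact expansion coefficients (a sufficiently long vector), their estimates for the first n + 1 indices, and the squared L2 norm of the estimated function are available; the names are illustrative.

```python
import numpy as np

def projection_errors(c_exact, c_hat, norm_sq):
    """L2 errors via Parseval's identity: the deterministic component is the
    tail of the exact coefficients, and the total error additionally includes
    the squared deviations of the estimated coefficients."""
    c_exact = np.asarray(c_exact)
    c_hat = np.asarray(c_hat)
    k = len(c_hat)                                   # k = n + 1 retained coefficients
    deterministic_sq = norm_sq - np.sum(c_exact[:k] ** 2)
    total_sq = deterministic_sq + np.sum((c_hat - c_exact[:k]) ** 2)
    return np.sqrt(deterministic_sq), np.sqrt(total_sq)
```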
Example 1. Let the density be taken from the two-parameter family (30), and let the related distribution function be described by Formula (32), with the corresponding parameter values. The modeling formula is given in Proposition 1. The density is continuous, non-differentiable at one point, but its first-order derivative exists almost everywhere on Ω. However, if we do not restrict ourselves to the space with natural s (see Remark 1), then a larger (fractional) degree of smoothness holds for the density, and the degree of smoothness of the distribution function is greater by one. This corresponds to Formulae (40), which take a specific form for the given parameters. According to Theorem 1, with the limit value of the smoothness parameter, the conditionally optimal parameters n and N required to achieve the approximation accuracy γ satisfy the corresponding relationship; this is confirmed by the statistical modeling results (see Table 1 and Table 2). In the row “∗” of Table 1, the deterministic component of the error decreases by an approximately constant factor when the projection expansion length n is doubled (similarly in Table 2). In the rest of these tables, the errors corresponding to the optimal parameters n and N are highlighted in bold, and they are consistent with the obtained relationship. Table 2 demonstrates the higher accuracy of the distribution function estimation. In the calculations with the formulae for the errors, the squared norms of the test density and distribution function are used:
For clarity, Figure 1 shows the approximation errors in graphical form. One axis corresponds to the value that determines the projection expansion length, and another axis corresponds to the value that determines the sample size; the vertical axis corresponds to the density approximation errors. The example under consideration corresponds to the left part of Figure 1, which presents two surfaces. The first one (red) corresponds to the obtained computational error; it is formed from the data in Table 1. The second one (blue, with marked nodes) corresponds to the theoretical error according to Formula (28):
The constants in this formula are approximately determined from the condition of the minimum sum of squared deviations between the theoretical and computational errors.
Example 2. Let the density be taken from the two-parameter family (30), and let the related distribution function be described by Formula (32), with the corresponding parameter values. The modeling formula is given in Proposition 2. The density is continuous, differentiable everywhere on Ω, and its second-order derivative exists almost everywhere on Ω. Considering the space with real non-negative s (see Remark 1), a larger (fractional) degree of smoothness holds for the density, and the degree of smoothness of the distribution function is greater by one. This corresponds to Formula (40) when the given parameters are substituted:
Theorem 1 implies that, with the limit value of the smoothness parameter, the conditionally optimal parameters n and N required to achieve the approximation accuracy γ satisfy the corresponding relationship, and this is illustrated by the statistical modeling results from Table 3 and Table 4. In the row “∗” of Table 3, if the projection expansion length n is doubled, then the deterministic component of the error decreases by an approximately constant factor (similarly in Table 4). In the rest of these tables, the errors corresponding to the optimal parameters n and N are shown in bold, and they are consistent with the obtained relationship. Table 4 shows the higher accuracy of the distribution function estimation. In the calculations with the formulae for the errors, the squared norms of the test density and distribution function are used:
Figure 1 contains a graphical representation of this numerical experiment; the meaning of the axes is described in Example 1. This example corresponds to the right part of Figure 1 with two surfaces. The first one (red) is constructed from the computational error given in Table 3, and the second one (blue, with marked nodes) corresponds to the theoretical error according to Formula (28):
where the constants are chosen from the same condition as in Example 1.

7. Comparison of Projection Density Estimate and Histogram
The classical method of estimating the density of a random variable is associated with the histogram [18], which is very often used in applied problems. We treat the histogram as a projection estimate since the main results of this paper are related to projection estimates.
We can define block pulse functions on the set Ω as
where L, a natural number, is the number of block pulse functions and h is the corresponding step. It is advisable to redefine the corresponding function in such a way that it becomes continuous on the left at the endpoint.
Block pulse functions (41) form an orthonormal system of functions in L2(Ω). This system is not complete, but it can be used to approximate an arbitrary function u:
where
For L = 2^m with natural m, the first L Walsh or Haar functions on Ω are exactly expressed through block pulse functions (41). Therefore, the results of this section can easily be adapted to the projection estimates of the distribution of a random variable using Walsh or Haar functions, which form complete orthonormal systems of functions [26].
The approximation accuracy in the L2 norm is usually estimated as follows [27]:
where the constant does not depend on L, under the corresponding smoothness condition.
As the function u in the given formulae, we can use the density g and the distribution function f. We restrict ourselves to the density g only (with a separate notation for the corresponding expansion coefficients):
where
Thus, the calculation of the expansion coefficients of the density from the two-parameter family (30) is reduced to the calculation of values of the corresponding distribution function described by Formula (32), i.e.,
To calculate the approximation error for the density, we can use a formula similar to the first equation of Formula (39):
The histogram can be defined by the expression based on approximation (43):
where the estimates of the expansion coefficients are based on observations of the random variable ξ. For example,
where ξ_l is the lth realization and N is the sample size (number of realizations). The value of h, which depends on L, specifies the histogram step.
The error of the histogram relative to the density g includes deterministic and stochastic components, and it is estimated from below by its deterministic component.
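For comparison with the Legendre-based estimates, the histogram built from block pulse functions can be sketched as follows, assuming the support [−1, 1] split into L equal bins of width h = 2/L; the sample below is the same hypothetical one as in the earlier sketches.

```python
import numpy as np

def block_pulse_density_estimate(sample, L, a=-1.0, b=1.0):
    """Histogram as a projection estimate onto L block pulse functions: on
    each bin, the estimate equals (bin count) / (N * h) with h = (b - a) / L."""
    sample = np.asarray(sample)
    h = (b - a) / L
    counts, _ = np.histogram(sample, bins=L, range=(a, b))
    heights = counts / (sample.size * h)
    def g_hist(x):
        idx = np.clip(((np.asarray(x) - a) / h).astype(int), 0, L - 1)
        return heights[idx]
    return g_hist

# Usage with the same hypothetical sample on [-1, 1]
rng = np.random.default_rng(0)
xi = 2.0 * rng.beta(2.0, 3.0, size=10_000) - 1.0
g_hist = block_pulse_density_estimate(xi, L=16)
```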
The results of calculations using Formula (44) for both densities from Examples 1 and 2 are given in Table 5 and Table 6, respectively. They should be compared with the rows “∗” in Table 1 and Table 3. Such a comparison shows the undoubted advantage of projection density estimates using Legendre polynomials: the approximation accuracy achieved with a large number of block pulse functions corresponds to the accuracy achieved with Legendre polynomials of noticeably lower maximum degree in Examples 1 and 2. If the number of block pulse functions L is doubled, then the deterministic component of the error decreases by approximately 2 times, and this conclusion does not depend on the degree of smoothness of the estimated density.
The problem of the conditional optimization of the algorithm for obtaining the histogram was solved in [1]. In this problem, the optimal parameters satisfy the relationship
which corresponds to inequality (42). A generalization of this result in the context of stochastic differential equations can be found in [3]. With the parameters chosen in this way, the computational error will be approximately twice as large as its deterministic component, since the conditional optimization of the algorithm for obtaining the histogram assumes the equalization of the deterministic and stochastic components.
For the densities from Examples 1 and 2, to reduce the approximation error of the histogram by 2 times, it is necessary to increase the number of block pulse functions L by 2 times and the sample size N by the corresponding factor. The projection estimates of densities using Legendre polynomials are more effective in these examples: to reduce the approximation error by 2 times, it suffices to increase the projection expansion length n and the sample size N by factors determined by the degree of smoothness (in this case, increasing n and N implies their subsequent rounding-up).