Article

Adaptive Nonparametric Density Estimation with B-Spline Bases

1 School of Mathematics, Hefei University of Technology, Hefei 230009, China
2 School of Physical and Mathematical Sciences, Nanjing Tech University, Nanjing 211816, China
3 Department of Mathematics, Hohai University, Nanjing 211100, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(2), 291; https://doi.org/10.3390/math11020291
Submission received: 4 December 2022 / Revised: 28 December 2022 / Accepted: 30 December 2022 / Published: 5 January 2023
(This article belongs to the Section Probability and Statistics)

Abstract

Learning density estimation is important in probabilistic modeling and reasoning with uncertainty. Since B-spline basis functions are piecewise polynomials with local support, density estimation with B-splines shows its advantages when intensive numerical computations are involved in the subsequent applications. To obtain an optimal local density estimation with B-splines, we need to select the bandwidth (i.e., the distance between two adjacent knots) for uniform B-splines. However, the selection of the bandwidth is challenging, and the computation is costly. On the other hand, nonuniform B-splines can improve on the approximation capability of uniform B-splines. Based on this observation, we perform density estimation with nonuniform B-splines. By introducing an error indicator attached to each interval, we propose an adaptive strategy to generate the nonuniform knot vector. The error indicator is a local approximation of the information entropy, which is closely related to the number of kernels when we construct the nonuniform estimator. The numerical experiments show that, compared with the uniform B-spline, the local density estimation with nonuniform B-splines not only achieves better estimation results but also effectively alleviates the overfitting phenomenon caused by the uniform B-splines. The comparison with the existing estimation procedures, including the state-of-the-art kernel estimators, demonstrates the accuracy of our new method.

1. Introduction

Nonparametric density estimation avoids the parametric assumptions in probabilistic modeling and reasoning, which achieves flexibility in data modeling while reducing the risk of model misspecification [1,2]. Hence, nonparametric density estimation is an important research area in statistics. In this paper, we focus on the estimation of univariate density functions, which is a classic problem in nonparametric statistics. We perform density estimation with nonuniform B-splines. By introducing the error indicator attached to each interval for density estimation, we propose an adaptive strategy to generate a nonuniform knot vector.
There are four main techniques for nonparametric estimation, i.e., histograms, orthogonal series, kernels, and splines. Histograms transform continuous data into discrete data, and important information may be lost during the discretization process [3]. Kernel density estimation is one of the most famous methods for density estimation and remains an active research area (see [4] and the references therein); for example, Terrell and Scott investigated possibilities for improving univariate and multivariate kernel density estimates by varying the window over the domain of estimation, both pointwise and globally [9]. In addition to kernel density estimators, orthogonal series estimators are also widely used, e.g., [5,6,7,8]. However, the global nature of orthogonal series estimators limits their applications.
Since B-spline basis functions possess the property of local support, local density estimators based on uniform B-splines have been discussed in [10,11,12,13]. In addition to their local support, B-spline basis functions are piecewise polynomials, which is advantageous when intensive numerical computations are conducted after estimation [13,14,15]. Local density estimators based on other splines have also been studied, e.g., logsplines [16,17], smoothing splines [18], penalized B-splines [19], and shape-constrained splines [20]. Recently, a Galerkin method was introduced to compute a B-spline estimator [21].
An important part of any basis estimation procedure is the bandwidth selection method. The existing literature on bandwidth selection is quite rich (see [21,22,23,24,25] and the references therein). It should be noted that a closed-form least squares cross-validation (LSCV) formula was proposed in [21], which can be used to determine the bandwidth efficiently.
In this paper, we use nonuniform B-splines as density estimators. By introducing a local error indicator attached to each interval, we design an adaptive refinement strategy, which increases the approximation capability of the local density estimator. The numerical experiments show that our adaptive local density estimation produces a smaller approximation error than the estimation with uniform B-splines. Comparison with the state-of-the-art density estimation methods shows that our adaptive method can approximate data with a squared error comparable to, and an absolute error significantly smaller than, those of other kernel density estimators.
The remainder of the paper is organized as follows. The next section reviews the definition of B-spline basis functions and their corresponding piecewise polynomial space. In Section 3, we detail the proposed method for density estimation based on B-splines. In Section 4, we propose an adaptive knot refinement strategy. Numerical experiments are provided in Section 5. Finally, Section 6 ends with conclusions.

2. B-Splines

Given a knot vector $U = [u_1, \ldots, u_{n+k+1}]$, where $u_1 \le u_2 \le \cdots \le u_{n+k+1}$ and $u_i < u_{i+k+1}$, $i = 1, \ldots, n$, the B-spline basis functions of degree k (order k + 1) are defined in a recursive fashion [26]:
$$N_{i,0}(x) = \begin{cases} 1, & u_i \le x < u_{i+1}, \\ 0, & \text{otherwise}, \end{cases} \qquad N_{i,j}(x) = \frac{x - u_i}{u_{i+j} - u_i}\, N_{i,j-1}(x) + \frac{u_{i+j+1} - x}{u_{i+j+1} - u_{i+1}}\, N_{i+1,j-1}(x), \quad j = 1, \ldots, k. \qquad (1)$$
The B-spline basis functions $N_{i,k}(x)$ are nonnegative and have local support (i.e., $N_{i,k}(x)$ is a nonzero polynomial only on $[u_i, u_{i+k+1})$). In addition, the B-spline basis functions form a partition of unity for $x \in [u_k, u_{n+1}]$ [27]. Figure 1 shows the cubic and quadratic B-spline basis functions defined on $[0, 10]$.
A B-spline of degree k is defined [28] as
$$p(x) = \sum_{i=1}^{n} d_i\, N_{i,k}(x), \qquad (2)$$
where $d_i \in \mathbb{R}$ is the i-th control coefficient, and $N_{i,k}(x)$ is the i-th B-spline basis function defined on the knot vector $U = [u_1, \ldots, u_{n+k+1}]$. A B-spline whose knot vector is evenly spaced is known as a uniform B-spline; otherwise, the B-spline is called a nonuniform B-spline.
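The recursion (1) and the evaluation of (2) translate directly into code. The following is a minimal NumPy sketch; the function names and the example knot vector of Figure 1a are illustrative choices, and zero-length knot intervals are handled by the usual convention that 0/0 terms are treated as zero.

import numpy as np

def bspline_basis(i, k, U, x):
    # Cox-de Boor recursion (1); i is a 0-based basis index, U a sorted knot array.
    if k == 0:
        return np.where((U[i] <= x) & (x < U[i + 1]), 1.0, 0.0)
    left_den = U[i + k] - U[i]
    right_den = U[i + k + 1] - U[i + 1]
    left = 0.0 if left_den == 0 else (x - U[i]) / left_den * bspline_basis(i, k - 1, U, x)
    right = 0.0 if right_den == 0 else (U[i + k + 1] - x) / right_den * bspline_basis(i + 1, k - 1, U, x)
    return left + right

def bspline(x, U, d, k):
    # Evaluate p(x) = sum_i d_i N_{i,k}(x) as in (2).
    return sum(d[i] * bspline_basis(i, k, U, x) for i in range(len(d)))

# Example: quadratic basis on the clamped knot vector of Figure 1a.
U = np.array([0, 0, 0, 1, 2, 3, 5, 8, 9, 10, 10, 10], dtype=float)
x = np.linspace(0, 10, 201)
vals = np.array([bspline_basis(i, 2, U, x) for i in range(len(U) - 3)])
assert np.allclose(vals.sum(axis=0)[:-1], 1.0)   # partition of unity on [0, 10)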
Let Δ be a sequence of distinct real numbers
$$\Delta = \{\eta_0 < \eta_1 < \cdots < \eta_{l+1}\},$$
where $\eta_0 = u_k$, $\eta_{l+1} = u_{n+1}$, and the interior knots satisfy
$$[u_{k+1}, \ldots, u_{n}] := [\underbrace{\eta_1, \ldots, \eta_1}_{k - r_1}, \ldots, \underbrace{\eta_l, \ldots, \eta_l}_{k - r_l}],$$
so that Δ forms a partition of the interval $[u_k, u_{n+1}]$. The space $S_k^{\mathbf{r}}(\Delta)$ of piecewise polynomials of degree k with smoothness $\mathbf{r} = (r_1, \ldots, r_l)$ over the partition Δ is defined by
$$S_k^{\mathbf{r}}(\Delta) = \{\, p : p|_{[\eta_i, \eta_{i+1})} \in \mathbb{P}_k,\ i = 0, \ldots, l;\ p \in C^{r_i}(\eta_i),\ i = 1, \ldots, l \,\},$$
where $\mathbb{P}_k$ is the space of all polynomials of degree k, and $C^{r_i}(\eta_i)$ is the space of all functions that are continuous of order $r_i$ at $\eta_i$.
Lemma 1
([29]). Given the knot vector U, the set of B-spline basis functions defined in (1) forms an alternative basis for the piecewise polynomial space $S_k^{\mathbf{r}}(\Delta)$.
Moreover, the approximation capabilities of $S_k^{\mathbf{r}}(\Delta)$ for a sufficiently smooth function defined over $[u_k, u_{n+1}]$ were described in [30]. It was shown that a sufficiently smooth function can be approximated with good accuracy by B-splines.

3. Density Estimation with B-Splines

Let $\{x_i\}_{i=1}^{N}$ be an independent and identically distributed random sample from a continuous probability density function f, i.e., $X \sim f(x)$. Zong and Lam proposed a method to find B-spline estimates of one-dimensional and two-dimensional probability density functions from a sample [31].
We define an estimate $\hat{f}(x\,|\,\alpha)$ of $f(x)$ of the form
$$f(x) \approx \hat{f}(x\,|\,\alpha) := \sum_{j=1}^{n} \alpha_j\, N_{j,k}(x), \qquad (3)$$
where $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{R}^{n}$ is the vector of coefficients. To fix the estimate $\hat{f}(x\,|\,\alpha)$ in the form of B-splines, we need to specify the degree k, the basis functions $N_{j,k}(x)$, and the coefficients $\alpha_j$, $j = 1, \ldots, n$.

3.1. Selecting the Degree and the Knot Vector

The degree k and the knot vector U need to be specified a priori to determine the basis functions. Based on the approximation ability and flexibility of quadratic and cubic B-splines, quadratic or cubic B-splines are usually chosen for local density estimation [21,31,32]. The selection of the knot vector is challenging and time consuming. Even when we restrict the knot vector to the uniform case, we still need to specify:
(1)
$\eta_0$ and $\eta_{l+1}$ (i.e., $u_k$ and $u_{n+1}$ in the knot vector), which determine the endpoints of the interval of the piecewise polynomial space $S_k^{\mathbf{r}}(\Delta)$;
(2)
the bandwidth $h = u_{i+1} - u_i$.
To ensure that all the sample values lie in the interval $[\eta_0, \eta_{l+1}]$, we can set
$$\eta_0 = a_0 - \gamma\,(b_0 - a_0), \qquad \eta_{l+1} = b_0 + \gamma\,(b_0 - a_0),$$
where $a_0 = \min\{x_1, x_2, \ldots, x_N\}$, $b_0 = \max\{x_1, x_2, \ldots, x_N\}$, and $\gamma$ is a parameter that controls the length of the interval. In the numerical experiments, $\gamma$ is set to 0.01 in general. Note that the values $a_0$, $b_0$ can be obtained in one pass through the data at a cost of $O(N)$.
The selection of the optimal bandwidth is generally based on a score of the estimated model. A penalized likelihood score is chosen to perform the selection in a principled way; e.g., the Bayesian information criterion (BIC) and measured entropy (ME) scores are adopted to select the bandwidth with the best score [31,32], where
$$\mathrm{BIC} := -2 \sum_{s=1}^{N} \log \hat{f}(x_s\,|\,\alpha) + n \log(N), \qquad (4)$$
$$\mathrm{ME} := -\int \hat{f}(x\,|\,\alpha) \log \hat{f}(x\,|\,\alpha)\, dx + \frac{3n - 1}{2N}. \qquad (5)$$
Note that ME is an asymptotically unbiased estimate of the information entropy. The information entropy of the real model f measured by the estimate $\hat{f}$ is defined as
$$H(f, \hat{f}) = -\int f(x) \log \hat{f}(x\,|\,\alpha)\, dx.$$
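For concreteness, a minimal sketch of the two scores is given below, assuming the signs of (4) and (5) as reconstructed above (so that both scores are to be made as small as possible over candidate bandwidths); here f_hat is any vectorized density estimate, and the quadrature grid size is an arbitrary choice rather than a value from the paper.

import numpy as np

def bic_score(f_hat, sample, n_coef):
    # BIC := -2 * sum_s log f_hat(x_s | alpha) + n * log(N), as in (4).
    N = len(sample)
    return -2.0 * np.sum(np.log(f_hat(sample))) + n_coef * np.log(N)

def me_score(f_hat, support, n_coef, N, grid=2000):
    # ME := -int f_hat log f_hat dx + (3n - 1)/(2N), as in (5),
    # with the integral approximated by a simple trapezoidal rule.
    x = np.linspace(support[0], support[1], grid)
    fx = np.clip(f_hat(x), 1e-300, None)          # avoid log(0) where the estimate vanishes
    integrand = fx * np.log(fx)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))
    return -integral + (3 * n_coef - 1) / (2 * N)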

3.2. Computing the Coefficients

When the degree and the knot vector are specified, we need to compute the coefficients $\alpha_1, \ldots, \alpha_n$ to obtain the B-spline estimate. In addition, two constraints need to be imposed to ensure that the resulting $\hat{f}(x\,|\,\alpha) := \sum_{j=1}^{n} \alpha_j N_{j,k}(x)$ is a valid probability density function, i.e.,
(1)
$\alpha_j \ge 0$, $j = 1, \ldots, n$, so that $\hat{f}(x\,|\,\alpha)$ is nonnegative over the distribution range;
(2)
$\int \hat{f}(x\,|\,\alpha)\, dx = 1$, which can be simplified to
$$\int \hat{f}(x\,|\,\alpha)\, dx = \sum_{j=1}^{n} \alpha_j \int N_{j,k}(x)\, dx = \sum_{j=1}^{n} \alpha_j\, \frac{u_{j+k+1} - u_j}{k+1} = 1. \qquad (6)$$
The coefficients $\alpha$ can be calculated by the maximum likelihood method, which can be formulated as a constrained optimization problem:
$$\max_{\alpha \in \mathbb{R}^{n}}\ \sum_{s=1}^{N} \log \hat{f}(x_s\,|\,\alpha) \quad \text{such that} \quad \alpha_1, \ldots, \alpha_n \ge 0, \quad \sum_{j=1}^{n} \alpha_j\, \frac{u_{j+k+1} - u_j}{k+1} = 1. \qquad (7)$$
The constrained optimization problem (7) can be solved efficiently by the iterative procedure [31]
$$\alpha_j^{(q)} = \frac{k+1}{N\,(u_{j+k+1} - u_j)} \sum_{s=1}^{N} \frac{\alpha_j^{(q-1)}\, N_{j,k}(x_s)}{\hat{f}(x_s\,|\,\alpha^{(q-1)})}, \qquad j = 1, \ldots, n, \qquad (8)$$
where q denotes the iteration number in the optimization process. The initial values are set to $\alpha_j^{(0)} = (k+1) / \sum_{i=1}^{n} (u_{i+k+1} - u_i)$, which satisfies the constraints in (7).
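A compact NumPy sketch of iteration (8) is given below; it reuses the bspline_basis helper sketched in Section 2, and the function name, tolerance, and iteration cap are illustrative choices rather than values from the paper. The multiplicative update keeps the coefficients nonnegative and preserves the normalization constraint at every step.

import numpy as np

def fit_coefficients(sample, U, k, max_iter=500, tol=1e-8):
    # Maximum-likelihood coefficients for (7) via the multiplicative iteration (8).
    U = np.asarray(U, dtype=float)
    N = len(sample)
    n = len(U) - k - 1                                   # number of basis functions
    w = (U[k + 1:] - U[:n]) / (k + 1)                    # w_j = integral of N_{j,k}
    B = np.array([[bspline_basis(j, k, U, x) for j in range(n)] for x in sample])
    alpha = np.full(n, (k + 1) / np.sum(U[k + 1:] - U[:n]))   # feasible start alpha^(0)
    for _ in range(max_iter):
        dens = B @ alpha                                 # f_hat(x_s | alpha) at the samples
        alpha_new = alpha / (N * w) * (B / dens[:, None]).sum(axis=0)
        if np.max(np.abs(alpha_new - alpha)) < tol:
            return alpha_new
        alpha = alpha_new
    return alpha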

4. Knot Refinement

Uniform B-splines may fail to capture the details of the input dataset; it has been shown that a suitable placement of knots can dramatically improve the approximation capability of B-splines [33]. Hence, we focused on the adaptive generation of the knot vector for density estimation in this paper.
We started with a coarse uniform knot vector; the adaptive procedure consisted of successive loops of the form
Compute coefficients → Estimate error → Mark and refine.
The computation of the coefficients can be accomplished by (8). The essential part of the loop is the error estimation step. Error estimation methods with a posteriori error control are well developed in numerical analysis (see [34,35] and the references therein for examples). We followed the ideas presented in [35,36] to derive an a posteriori error estimator based on B-splines.

4.1. A Residual-Based A Posteriori Error Estimator with B-Splines

We aimed to refine only those intervals $I_i = [u_i, u_{i+1}]$ that contributed significantly to the error $f(x) - \hat{f}(x\,|\,\alpha)$. However, since the true density function $f(x)$ was unknown, we defined a local error indicator attached to the interval $I_i$ as follows:
$$\tau_i = -\frac{1}{N_i} \sum_{j=1}^{N_i} \log \hat{f}(x_{i_j}\,|\,\alpha), \qquad (9)$$
where $x_{i_1}, \ldots, x_{i_{N_i}}$ are the sampling points of $\{x_j\}_{j=1}^{N}$ located in the interval $I_i$. Note that $\tau_i$ is an estimate of the information entropy restricted to the interval $I_i$:
$$H(f, \hat{f})\big|_{I_i} = -\int_{I_i} f(x) \log \hat{f}(x\,|\,\alpha)\, dx.$$
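A small sketch of evaluating the indicators (9) over the knot intervals is shown below; breakpoints denotes the distinct knots between $\eta_0$ and $\eta_{l+1}$, and the names are illustrative.

import numpy as np

def error_indicators(sample, breakpoints, f_hat):
    # tau_i = -(1/N_i) * sum of log f_hat over the samples in I_i = [eta_i, eta_{i+1}).
    tau = np.zeros(len(breakpoints) - 1)
    idx = np.searchsorted(breakpoints, sample, side='right') - 1   # interval index of each sample
    idx = np.clip(idx, 0, len(tau) - 1)
    logf = np.log(f_hat(sample))
    for i in range(len(tau)):
        mask = (idx == i)
        if mask.any():                     # intervals without samples keep tau_i = 0
            tau[i] = -logf[mask].mean()
    return tau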

4.2. Adaptive Refinement Strategy

Inspired by adaptive refinement strategies in numerical analysis, we introduced an adaptive refinement procedure to compute a sequence of estimates that converges to the true probability density function. As the error indicator for each interval was available, we marked for refinement each interval $I_i$ with a large error. In order to find the intervals with a large error efficiently, we adopted the refinement strategy given in [37], with a slight modification, as Algorithm 1.
Algorithm 1: Refinement algorithm
[The pseudocode of Algorithm 1 is presented as a figure in the original article.]
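Since the body of Algorithm 1 appears only as a figure, the sketch below is merely a plausible reading based on the bulk (Dörfler-type) marking used in [37]: a minimal set of intervals carrying a fixed fraction theta of the total error indicator is marked, and each marked interval is bisected at its midpoint. Both the parameter theta and the midpoint-insertion rule are assumptions for illustration, not taken from the paper.

import numpy as np

def refine_knots(breakpoints, tau, theta=0.5):
    # Mark intervals whose indicators sum to a theta-fraction of the total,
    # then insert the midpoint of every marked interval.
    order = np.argsort(tau)[::-1]                  # intervals by decreasing error indicator
    total, acc, marked = tau.sum(), 0.0, []
    for i in order:
        if acc >= theta * total or tau[i] <= 0:
            break
        marked.append(i)
        acc += tau[i]
    midpoints = [(breakpoints[i] + breakpoints[i + 1]) / 2.0 for i in marked]
    if not midpoints:
        return np.asarray(breakpoints)
    return np.sort(np.concatenate([breakpoints, midpoints]))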
We implemented the adaptive strategy in this paper as in Algorithm 2.
Algorithm 2: Adaptive probability density function estimation
Input: An observed sample $\{x_i\}_{i=1}^{N}$; the degree k of the B-spline; the number m of initial interior knots.
Output: A probability density function $\hat{f}(x\,|\,\alpha) = \sum_{j=1}^{n} \alpha_j N_{j,k}(x)$.
Steps:
1. Find $a_0 = \min\{x_i\}_{i=1}^{N}$ and $b_0 = \max\{x_i\}_{i=1}^{N}$. Set $\eta_0 = a_0 - \gamma\,(b_0 - a_0)$ and $\eta_{m+1} = b_0 + \gamma\,(b_0 - a_0)$ with $\gamma = 0.01$.
2. Initialize $U = [\eta_0, \ldots, \eta_0, \eta_1, \eta_2, \ldots, \eta_m, \eta_{m+1}, \ldots, \eta_{m+1}]$, where $\eta_0$ and $\eta_{m+1}$ are each repeated $k+1$ times and $\eta_i = \eta_0 + i\,(\eta_{m+1} - \eta_0)/(m+1)$, $i = 1, \ldots, m$.
3. Apply (8) to compute the coefficients $\alpha$.
4. Evaluate the error indicators $\tau_i$ as in (9).
5. Apply Algorithm 1 to obtain a refined knot vector $U'$.
6. If $U' \ne U$, update $U = U'$ and return to Step 3.
7. Set $\hat{f}(x\,|\,\alpha)$ as in (3) and output $\hat{f}(x\,|\,\alpha)$ as an estimate of $f(x)$.
Remark 1.
Compared with the case of a uniform B-spline, where the knot vector is selected by an exhaustive search [31], our adaptive refinement strategy generates the knot vector automatically.
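Putting the pieces together, the following end-to-end sketch mirrors Steps 1 to 7 of Algorithm 2 using the helpers sketched earlier (bspline_basis, bspline, fit_coefficients, error_indicators, refine_knots); all of these names, as well as the cap on the number of refinement rounds, are illustrative rather than part of the paper.

import numpy as np

def adaptive_density_estimate(sample, k=3, m=8, gamma=0.01, max_rounds=10):
    sample = np.asarray(sample, dtype=float)
    a0, b0 = sample.min(), sample.max()                           # Step 1
    eta0, eta_end = a0 - gamma * (b0 - a0), b0 + gamma * (b0 - a0)
    breaks = np.linspace(eta0, eta_end, m + 2)                    # Step 2: uniform interior knots
    for _ in range(max_rounds):
        U = np.concatenate([[eta0] * k, breaks, [eta_end] * k])   # clamped knot vector
        alpha = fit_coefficients(sample, U, k)                    # Step 3: iteration (8)
        f_hat = lambda x, U=U, a=alpha: bspline(x, U, a, k)
        tau = error_indicators(sample, breaks, f_hat)             # Step 4: indicators (9)
        new_breaks = refine_knots(breaks, tau)                    # Step 5: Algorithm 1
        if len(new_breaks) == len(breaks):                        # Step 6: knots unchanged, stop
            break
        breaks = new_breaks
    return f_hat                                                  # Step 7: the estimate (3)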

5. Numerical Experiments

In this section, we report the results of several numerical experiments. We start by introducing the different comparison measures in Section 5.1. Section 5.2 shows a comparison of the accuracy of nonuniform B-spline density estimators versus uniform B-spline density estimators. In Section 5.3, we compare the nonuniform B-spline density estimator to the existing kernel density estimators and orthogonal sequence estimators.

5.1. Comparison Measures

We used different measures to evaluate the quality of the estimators computed based on the samples.
  • The measured entropy (ME) of the samples given by the estimator, which is defined as (5).
  • The BIC score of the samples given by the estimator, which is defined as (4).
In addition, we also used the MAE and the root-MSE to measure how close the estimate $\hat{f}(x\,|\,\alpha)$ was to the true density $f(x)$, where $x_i$ denotes a sample point (a small code sketch of both measures follows the list):
  • The root mean square error (root-MSE) between the estimate $\hat{f}(x\,|\,\alpha)$ and the true density $f(x)$:
$$\text{root-MSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \big(\hat{f}(x_i\,|\,\alpha) - f(x_i)\big)^2}\,;$$
  • The mean absolute error (MAE) between the estimate $\hat{f}(x\,|\,\alpha)$ and the true density $f(x)$:
$$\text{MAE} = \frac{1}{N} \sum_{i=1}^{N} \big|\hat{f}(x_i\,|\,\alpha) - f(x_i)\big|.$$
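As referenced above, a tiny sketch of the two measures (the function names are illustrative):

import numpy as np

def root_mse(f_hat_vals, f_vals):
    # Root mean square error between estimated and true density values at the sample points.
    return np.sqrt(np.mean((np.asarray(f_hat_vals) - np.asarray(f_vals)) ** 2))

def mae(f_hat_vals, f_vals):
    # Mean absolute error between estimated and true density values at the sample points.
    return np.mean(np.abs(np.asarray(f_hat_vals) - np.asarray(f_vals)))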

5.2. Uniform B-Spline Estimators vs. Nonuniform B-Spline Estimators

First, we compared the uniform B-spline probability density estimator with the adaptive nonuniform B-spline probability density estimator; the generation of the nonuniform knot vector was described in Section 4.
Table 1 shows the name of the datasets, the probability distribution, and the approximation domain. The comparative experimental results of the uniform B-spline and the nonuniform B-spline are shown in Table 2, and the fitting results are shown in Figure 2.
When the sample size was fixed at 1000, the errors of the uniform B-spline and the nonuniform B-spline are compared in Table 2. The measured entropy (ME), the information entropy (H), the mean absolute error (MAE), the root mean square error (root-MSE), and the Bayesian information criterion (BIC) scores are listed, and they demonstrate that the adaptive nonuniform B-spline estimators usually outperformed the uniform B-spline estimators. In addition, compared to the uniform B-spline estimators, the adaptive nonuniform B-spline estimators were usually closer to the true density functions, as shown in Figure 2.
When the sample size varied as N = 50, 100, 500, 1000, 5000, the root-MSE and MAE results for the uniform and the nonuniform B-spline estimators are listed in Table 3. From the ratios of the MAE and the root-MSE, we can see that the fitting results of the adaptive nonuniform B-spline outperformed those of the uniform B-spline.

5.3. Comparison with Orthogonal Sequence and Kernel Estimators

Table 4 and Table 5 compare our method with the previously mentioned probability density function estimation methods: the orthogonal sequence estimators of [7,13] and kernel estimators with three bandwidth-selection strategies, namely, the rule-of-thumb method based on the asymptotic mean integrated squared error (ROT) [38], the least squares cross-validation method (LCV) [38], and the method proposed by Hall et al., which plugs estimates into the usual asymptotic representation of the optimal bandwidth with two important modifications (HALL) [39]. The experimental results are reported in terms of the MAE and the root-MSE.
In Table 4, we observe that, for the same sample size, the root-MSE and MAE of the nonuniform B-spline estimator were smaller than those of the other methods, which shows that the estimation effect of the adaptive nonuniform B-splines was better than that of the listed methods. In addition, the fitting results obtained by the nonuniform B-splines overcame the overfitting phenomenon of the uniform method.
In Table 5, the error analysis for different sample sizes of the nonuniform B-spline, orthogonal sequence, and kernel methods is listed. The experimental results show that the errors of the nonuniform B-spline were smaller and decreased as the sample size increased. This also shows that the B-spline fitting with the nonuniform knots generated by our adaptive strategy achieved a better fit.

6. Conclusions

In this work, we introduced a novel density estimation with nonuniform B-splines. By introducing the error indicator attached to each interval for density estimation, we proposed an adaptive strategy to generate the nonuniform knot vector. The numerical experiments showed that, compared with the uniform B-spline, the local density estimation with nonuniform B-splines not only achieved better estimation results but also effectively alleviated the overfitting phenomenon caused by the uniform B-splines. The comparison with the existing estimation procedures, including the state-of-the-art kernel estimators, demonstrated the accuracy of our new method.
In the future, it would be interesting to extend the method considered in the paper to multivariate density cases. Another natural direction to pursue further is the fast automatic knot placement method via feature characterization from the samples, which can generate the nonuniform knot vector directly. We leave these topics for future research.

Author Contributions

Y.Z., M.Z., Q.N. and X.W. had equal contributions. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China No.122011292 and No.61772167.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Siegel, S. Nonparametric statistics. Am. Stat. 1957, 11, 13–19. [Google Scholar]
  2. Gibbons, J.D.; Chakraborti, S. Nonparametric Statistical Inference; CRC Press: Boca Raton, FL, USA, 2014. [Google Scholar]
  3. García, S.; Luengo, J.; Sáez, J.A.; López, V.; Herrera, F. A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning. IEEE Trans. Knowl. Data Eng. 2013, 25, 734–750. [Google Scholar] [CrossRef]
  4. Bhattacharya, A.; Dunson, D.B. Nonparametric Bayesian density estimation on manifolds with applications to planar shapes. Biometrika 2010, 97, 851–865. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Hall, P. Cross-validation and the smoothing of orthogonal series density estimators. J. Multivar. Anal. 1987, 21, 189–206. [Google Scholar] [CrossRef] [Green Version]
  6. Dai, X.; Müller, H.-G.; Yao, F. Optimal Bayes classifiers for functional data and density ratios. Biometrika 2017, 104, 545–560. [Google Scholar]
  7. Leitao, Á.; Oosterlee, C.W.; Ortiz-Gracia, L.; Bohte, S.M. On the data-driven COS method. Appl. Math. Comput. 2018, 317, 68–84. [Google Scholar] [CrossRef] [Green Version]
  8. Ait-Hennani, L.; Kaid, Z.; Laksaci, A.; Rachdi, M. Nonparametric estimation of the expected shortfall regression for quasi-associated functional data. Mathematics 2022, 10, 4508. [Google Scholar] [CrossRef]
  9. Terrell, G.R.; Scott, D.W. Variable kernel density estimation. Ann. Stat. 1992, 20, 1236–1265. [Google Scholar] [CrossRef]
  10. Lamnii, A.; Nour, M.Y.; Zidna, A. A reverse non-stationary generalized b-splines subdivision scheme. Mathematics 2021, 9, 2628. [Google Scholar] [CrossRef]
  11. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Routledge: London, UK, 2018. [Google Scholar]
  12. Redner, R.A. Convergence rates for uniform B-spline density estimators, Part I: One dimension. SIAM J. Sci. Comput. 1999, 20, 1929–1953. [Google Scholar] [CrossRef]
  13. Cui, Z.; Kirkby, J.L.; Nguyen, D. Nonparametric density estimation by B-spline duality. Econom. Theory 2020, 36, 250–291. [Google Scholar] [CrossRef]
  14. Cui, Z.; Kirkby, J.L.; Nguyen, D. A data-driven framework for consistent financial valuation and risk measurement. Eur. J. Oper. Res. 2021, 289, 381–398. [Google Scholar] [CrossRef]
  15. Cui, Z.; Kirkby, J.L.; Nguyen, D. Efficient simulation of generalized SABR and stochastic local volatility models based on Markov chain approximations. Eur. J. Oper. Res. 2021, 290, 1046–1062. [Google Scholar] [CrossRef]
  16. Kooperberg, C.; Stone, C.J. Comparison of parametric and bootstrap approaches to obtaining confidence intervals for logspline density estimation. J. Comput. Graph. Stat. 2004, 13, 106–122. [Google Scholar] [CrossRef]
  17. Koo, J.-Y. Bivariate B-splines for tensor logspline density estimation. Comput. Stat. Data Anal. 1996, 21, 31–42. [Google Scholar] [CrossRef]
  18. Gu, C. Smoothing spline density estimation: A dimensionless automatic algorithm. J. Am. Stat. Assoc. 1993, 88, 495–504. [Google Scholar] [CrossRef]
  19. Eilers, P.H.; Marx, B.D. Flexible smoothing with B-splines and penalties. Stat. Sci. 1996, 11, 89–121. [Google Scholar] [CrossRef]
  20. Papp, D.; Alizadeh, F. Shape-constrained estimation using nonnegative splines. J. Comput. Graph. Stat. 2014, 23, 211–231. [Google Scholar] [CrossRef]
  21. Kirkby, J.L.; Leitao, Á.; Nguyen, D. Nonparametric density estimation and bandwidth selection with B-spline bases: A novel Galerkin method. Comput. Stat. Data Anal. 2021, 159, 107202. [Google Scholar] [CrossRef]
  22. Oliveira, M.; Crujeiras, R.M.; Rodríguez-Casal, A. A plug-in rule for bandwidth selection in circular density estimation. Comput. Stat. Data Anal. 2012, 56, 3898–3908. [Google Scholar] [CrossRef] [Green Version]
  23. Boente, G.; Rodriguez, D. Robust bandwidth selection in semiparametric partly linear regression models: Monte carlo study and influential analysis. Comput. Stat. Data Anal. 2008, 52, 2808–2828. [Google Scholar] [CrossRef]
  24. Hall, P.; Kang, K.-H. Bandwidth choice for nonparametric classification. Ann. Stat. 2005, 33, 284–306. [Google Scholar] [CrossRef]
  25. Loader, C. Bandwidth selection: Classical or plug-in? Ann. Stat. 1999, 27, 415–438. [Google Scholar] [CrossRef]
  26. de Boor, C. A Practical Guide to Splines; Springer: New York, NY, USA, 1978; Volume 27. [Google Scholar]
  27. Farin, G. Curves and Surfaces for Computer-Aided Geometric Design: A Practical Guide; Elsevier: Amsterdam, The Netherlands, 2014. [Google Scholar]
  28. Ezhov, N.; Neitzel, F.; Petrovic, S. Spline approximation, part 2: From polynomials in the monomial basis to b-splines—A derivation. Mathematics 2021, 9, 2198. [Google Scholar] [CrossRef]
  29. Curry, H.B.; Schoenberg, I.J. On spline distributions and their limits: The Pólya distribution functions. Bull. Am. Math. Soc. 1947, 53, 1114. [Google Scholar]
  30. Lyche, T.; Manni, C.; Speleers, H. Foundations of spline theory: B-splines, spline approximation, and hierarchical refinement. In Splines and PDEs: From Approximation Theory to Numerical Linear Algebra; Springer: Berlin/Heidelberg, Germany, 2018; pp. 1–76. [Google Scholar]
  31. Zong, Z.; Lam, K. Estimation of complicated distributions using B-spline functions. Struct. Saf. 1998, 20, 341–355. [Google Scholar] [CrossRef]
  32. López-Cruz, P.L.; Bielza, C.; Larrañaga, P. Learning mixtures of polynomials of multidimensional probability densities from data using B-spline interpolation. Int. J. Approx. Reason. 2014, 55, 989–1010. [Google Scholar] [CrossRef]
  33. Loock, W.V.; Pipeleers, G.; Schutter, J.D.; Swevers, J. A convex optimization approach to curve fitting with B-splines. IFAC Proc. Vol. 2011, 44, 2290–2295. [Google Scholar] [CrossRef] [Green Version]
  34. Bastian, P.; Wittum, G. Adaptive multigrid methods: The UG concept. In Adaptive Methods—Algorithms, Theory and Applications; Springer: Berlin/Heidelberg, Germany, 1994; pp. 17–37. [Google Scholar]
  35. Verfürth, R. A Review of a Posteriori Error Estimation and Adaptive Mesh-Refinement Techniques; BG Teubner: Leipzig, Germany, 1996. [Google Scholar]
  36. Dörfel, M.R.; Jüttler, B.; Simeon, B. Adaptive isogeometric analysis by local h-refinement with T-splines. Comput. Methods Appl. Mech. Eng. 2010, 199, 264–275. [Google Scholar] [CrossRef] [Green Version]
  37. Morin, P.; Nochetto, R.H.; Siebert, K.G. Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal. 2000, 38, 466–488. [Google Scholar] [CrossRef]
  38. Węglarczyk, S. Kernel density estimation and its application. In ITM Web of Conferences; EDP Sciences: Les Ulis, France, 2018; p. 00037. [Google Scholar]
  39. Troudi, M.; Alimi, A.M.; Saoudi, S. Analytical plug-in method for kernel density estimator applied to genetic neutrality study. EURASIP J. Adv. Signal Process. 2008, 2008, 1–8. [Google Scholar] [CrossRef]
Figure 1. B-spline basis functions defined on the interval [0, 10]. (a) Knot vector U = [0, 0, 0, 1, 2, 3, 5, 8, 9, 10, 10, 10]. (b) Knot vector U = [0, 0, 0, 0, 1, 2, 3, 5, 8, 9, 10, 10, 10, 10].
Figure 2. Density estimates using uniform and nonuniform B-splines. The blue rectangles represent the histogram, the blue solid line the true density function, the yellow dashed line the uniform B-spline estimate, and the red dotted line the adaptive nonuniform B-spline estimate. The knots of the nonuniform B-splines are marked as asterisks along the horizontal axis.
Table 1. Probability density functions.
Name        | Distribution                                                    | Domain
Gauss       | N(0, 1)                                                         | [-3, 3]
Exp         | Exp(1)                                                          | [0, 3]
Chisq       | χ²(7)                                                           | [0, 25]
MixGauss    | 0.3 N(0.25, 0.33) + 0.5 N(3.25, 1.0)                            | [-1.0, 7.0]
Mix1d       | 0.8 χ²(3.00) + 0.2 N(7.00, 1.00)                                | [0.00, 10.00]
MixGauss2   | 0.33 N(3.0, 1.0²) + 0.33 N(8.0, 0.33²) + 0.33 N(10, 0.11²)     | [0.0, 10.0]
Table 2. The goodness of fit for the data using the uniform B-spline and nonuniform B-spline methods (The sample size is 1000).
                     Gauss     Exp       Chisq     MixGauss  Mix1d     MixGauss2
Uniform B-spline
ME                   1.4152    0.7975    2.6337    1.6083    2.1494    1.8671
H                    1.3993    0.7391    2.6151    0.8857    2.1060    0.4424
root-MSE             0.0632    0.0360    0.0032    0.1245    0.0044    0.1766
MAE                  0.0163    0.0321    0.0027    0.1116    0.0048    0.1506
BIC (x 10^3)         2.8749    1.6417    5.3196    3.2501    4.3413    3.7524
Nonuniform B-spline
ME                   1.4202    0.7991    2.6352    1.5978    2.1500    1.8855
H                    1.3962    0.7381    2.6144    0.8854    2.1039    0.4615
root-MSE             0.0032    0.0360    0.0026    0.1216    0.0032    0.1780
MAE                  0.0046    0.0313    0.0022    0.1109    0.0032    0.1506
BIC (x 10^3)         2.8836    1.6443    5.3206    3.2346    4.3438    3.7578
Table 3. The goodness of fit for data using the uniform B-spline and the nonuniform B-spline with different sample sizes, where Ratio-root-MSE = root-MSE(nonuniform) / root-MSE(uniform) and Ratio-MAE is defined analogously.
N      Case        Uniform B-spline        Nonuniform B-spline     Ratio-root-MSE   Ratio-MAE
                   root-MSE    MAE         root-MSE    MAE
50     Gauss       0.1308      0.0737      0.1140      0.0804      0.8719           1.0909
100                0.0794      0.0701      0.0714      0.0592      0.8997           0.8445
500                0.0412      0.0340      0.0387      0.0310      0.9393           0.9118
1000               0.0200      0.0163      0.0071      0.0046      0.3550           0.2822
5000               0.0100      0.0063      0.0071      0.0039      0.7143           0.6190
50     Exp         0.1612      0.1339      0.1411      0.1127      0.8749           0.8417
100                0.1625      0.1077      0.1507      0.1142      0.9273           1.0604
500                0.0458      0.0393      0.0412      0.0375      0.8997           0.9542
1000               0.0361      0.0321      0.0361      0.0313      1.0000           0.9751
5000               0.0224      0.0205      0.0245      0.0216      1.0954           1.0537
50     Chisq       0.0305      0.0266      0.0879      0.0231      2.8820           0.8684
100                0.0208      0.0178      0.0643      0.0172      3.0913           0.9663
500                0.0106      0.0091      0.0106      0.0091      1.0000           1.0000
1000               0.0032      0.0027      0.0088      0.0075      2.7500           2.7778
5000               0.0115      0.0098      0.0115      0.0098      1.0000           1.0000
50     MixGauss    0.1288      0.1074      0.1327      0.1078      1.0303           1.0037
100                0.1187      0.1091      0.1170      0.1085      0.9857           0.9945
500                0.1200      0.1097      0.1196      0.1100      0.9967           1.0027
1000               0.1245      0.1116      0.1217      0.1109      0.9775           0.9937
5000               0.1175      0.1059      0.1158      0.1045      0.9855           0.9868
50     Mix1d       0.0906      0.0724      0.1039      0.0747      1.1476           1.0318
100                0.0866      0.0526      0.0663      0.0515      0.7659           0.9791
500                0.0173      0.0131      0.0173      0.0136      1.0000           1.0382
1000               0.0053      0.0048      0.0044      0.0032      0.8148           0.6667
5000               0.0077      0.0028      0.0063      0.0028      0.8182           1.0000
50     MixGauss2   0.1780      0.1588      0.1766      0.1578      0.9921           0.9937
100                0.1929      0.1735      0.1881      0.1688      0.9951           0.9729
500                0.1709      0.1537      0.1685      0.1498      0.9860           0.9746
1000               0.1766      0.1506      0.1780      0.1506      1.0079           1.0000
5000               0.1744      0.1540      0.1735      0.1514      0.9948           0.9831
Table 4. The goodness of fit for data using the nonuniform B-spline methods, orthogonal sequence, and kernel estimators (The number of sample points is 1000).
                     Gauss     Exp       Chisq     MixGauss  Mix1d     MixGauss2
Nonuniform B-spline
root-MSE             0.0566    0.0361    0.0026    0.1217    0.0032    0.1780
MAE                  0.0046    0.0313    0.0022    0.1109    0.0032    0.1506
Orthogonal sequence
root-MSE             0.1442    0.1860    0.0097    0.1204    0.0332    0.1020
MAE                  0.1329    0.1508    0.0087    0.1035    0.0280    0.0958
Kernel ROT
root-MSE             0.2848    0.5742    0.0217    0.1273    0.0843    0.0309
MAE                  0.2655    0.5084    0.0198    0.1163    0.0779    0.0280
Kernel LCV
root-MSE             0.2871    0.5896    0.0574    0.1364    0.1292    0.0539
MAE                  0.2676    0.5229    0.0528    0.1247    0.1214    0.0479
Kernel HALL
root-MSE             0.3003    0.5887    0.0831    0.1442    0.1285    0.0539
MAE                  0.2802    0.5222    0.0769    0.1317    0.1204    0.0481
Table 5. The goodness of fit using the nonuniform B-spline, orthogonal sequence, and kernel estimators with different sample sizes.
N                  Gauss     Exp       Chisq     MixGauss  Mix1d     MixGauss2
Nonuniform B-spline
50      root-MSE   0.1140    0.1411    0.0879    0.1327    0.1039    0.1766
        MAE        0.0804    0.1127    0.0231    0.1078    0.0747    0.1578
100     root-MSE   0.0714    0.1507    0.0643    0.1170    0.0663    0.1881
        MAE        0.0592    0.1142    0.0172    0.1085    0.0515    0.1688
500     root-MSE   0.0387    0.0412    0.0106    0.1196    0.0173    0.1685
        MAE        0.0310    0.0375    0.0091    0.1100    0.0136    0.1498
1000    root-MSE   0.0043    0.0361    0.0026    0.1217    0.0032    0.1780
        MAE        0.0046    0.0313    0.0022    0.1109    0.0032    0.1506
5000    root-MSE   0.0063    0.0245    0.0115    0.1158    0.0023    0.1735
        MAE        0.0039    0.0216    0.0098    0.1045    0.0028    0.1514
Orthogonal sequence
50      root-MSE   0.1149    0.2006    0.0159    0.0837    0.0265    0.0640
        MAE        0.1097    0.1618    0.0136    0.0696    0.0204    0.0585
100     root-MSE   0.1175    0.2102    0.0216    0.1095    0.0265    0.0207
        MAE        0.1087    0.1602    0.0165    0.0952    0.0220    0.0169
500     root-MSE   0.1530    0.1817    0.0056    0.0608    0.0436    0.0943
        MAE        0.1412    0.1484    0.0049    0.0137    0.0391    0.0844
1000    root-MSE   0.1442    0.1860    0.0097    0.1204    0.0332    0.1020
        MAE        0.1329    0.1508    0.0087    0.1035    0.0280    0.0958
5000    root-MSE   0.1490    0.1819    0.0132    0.1253    0.0346    0.1109
        MAE        0.1369    0.1470    0.0117    0.1083    0.0302    0.0990
Kernel ROT
50      root-MSE   0.2577    0.5544    0.1162    0.0707    0.0548    0.0469
        MAE        0.2362    0.4932    0.1111    0.0595    0.0496    0.0438
100     root-MSE   0.2390    0.5840    0.0748    0.1005    0.0436    0.0235
        MAE        0.2163    0.5223    0.0719    0.0908    0.0346    0.0202
500     root-MSE   0.2835    0.5700    0.0096    0.1187    0.0721    0.0216
        MAE        0.2663    0.5059    0.0078    0.1086    0.0673    0.0193
1000    root-MSE   0.2848    0.5742    0.0217    0.1273    0.0843    0.0309
        MAE        0.2655    0.5084    0.0198    0.1163    0.0779    0.0280
5000    root-MSE   0.2926    0.5778    0.0548    0.1338    0.1058    0.0424
        MAE        0.2714    0.5095    0.0505    0.1221    0.0987    0.0385
Kernel LCV
50      root-MSE   0.2366    0.5916    0.1916    0.0566    0.0548    0.0632
        MAE        0.2161    0.5209    0.1842    0.0471    0.0465    0.0595
100     root-MSE   0.2332    0.6084    0.0200    0.1127    0.0964    0.0480
        MAE        0.2107    0.5459    0.0174    0.1024    0.0888    0.0431
500     root-MSE   0.2751    0.5902    0.0244    0.1356    0.1265    0.0500
        MAE        0.2584    0.5251    0.0207    0.1242    0.1191    0.0453
1000    root-MSE   0.2871    0.5896    0.0574    0.1364    0.1292    0.0539
        MAE        0.2676    0.5229    0.0528    0.1247    0.1214    0.0479
5000    root-MSE   0.2888    0.5868    0.0787    0.1393    0.1330    0.0548
        MAE        0.2679    0.5178    0.0730    0.1268    0.1244    0.0496
Kernel HALL
50      root-MSE   0.2848    0.5873    0.0332    0.1158    0.0964    0.0314
        MAE        0.2619    0.5264    0.0239    0.1020    0.0867    0.0273
100     root-MSE   0.2729    0.6083    0.0539    0.1338    0.1068    0.0436
        MAE        0.2498    0.5458    0.0485    0.1216    0.0992    0.0397
500     root-MSE   0.3030    0.5886    0.0762    0.1421    0.1277    0.0520
        MAE        0.2846    0.5236    0.0701    0.1300    0.1200    0.0471
1000    root-MSE   0.3003    0.5887    0.0831    0.1442    0.1285    0.0539
        MAE        0.2802    0.5222    0.0769    0.1317    0.1204    0.0481
5000    root-MSE   0.3013    0.5863    0.0883    0.1439    0.1319    0.0557
        MAE        0.2796    0.5174    0.0816    0.1313    0.1231    0.0505
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhao, Y.; Zhang, M.; Ni, Q.; Wang, X. Adaptive Nonparametric Density Estimation with B-Spline Bases. Mathematics 2023, 11, 291. https://doi.org/10.3390/math11020291

AMA Style

Zhao Y, Zhang M, Ni Q, Wang X. Adaptive Nonparametric Density Estimation with B-Spline Bases. Mathematics. 2023; 11(2):291. https://doi.org/10.3390/math11020291

Chicago/Turabian Style

Zhao, Yanchun, Mengzhu Zhang, Qian Ni, and Xuhui Wang. 2023. "Adaptive Nonparametric Density Estimation with B-Spline Bases" Mathematics 11, no. 2: 291. https://doi.org/10.3390/math11020291
