# Computationally Efficient Bootstrap Expressions for Bandwidth Selection in Nonparametric Curve Estimation


## Abstract


## 1. Introduction

## 2. Nonparametric Density Estimation

## 3. Nonparametric Hazard Rate Estimation

## 4. Simulation Results

- **Density estimation:** An AR(1) model given by ${X}_{t}=-0.6{X}_{t-1}+0.8{a}_{t}$, where ${a}_{t}\stackrel{d}{=}N(0,1)$.
- **Hazard rate estimation:** A Gumbel model such that $f\left(x\right)={e}^{-x}{e}^{-{e}^{-x}},\forall x\ge 0$.
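The two simulation models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `simulate_ar1` and `simulate_gumbel` are hypothetical, the AR(1) chain is started from its stationary law (whose variance $0.8^2/(1-0.6^2)$ equals 1), and the Gumbel draws come from inverting the standard Gumbel distribution function $F(x)=e^{-e^{-x}}$ on the whole real line, without imposing any restriction on the support.

```python
import math
import random


def simulate_ar1(n, phi=-0.6, sigma=0.8, rng=None):
    """Simulate X_t = phi * X_{t-1} + sigma * a_t with a_t ~ N(0, 1)."""
    rng = rng or random.Random()
    # Start from the stationary law N(0, sigma^2 / (1 - phi^2)).
    x = rng.gauss(0.0, math.sqrt(sigma ** 2 / (1.0 - phi ** 2)))
    path = []
    for _ in range(n):
        x = phi * x + sigma * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def simulate_gumbel(n, rng=None):
    """Draw from the standard Gumbel law by inverting F(x) = exp(-exp(-x))."""
    rng = rng or random.Random()
    draws = []
    while len(draws) < n:
        u = rng.random()  # uniform on [0, 1)
        if u > 0.0:  # skip u == 0 to avoid log(0)
            draws.append(-math.log(-math.log(u)))
    return draws
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible, which is convenient when comparing bandwidth selectors over the same set of simulated samples.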

## 5. Discussion

## Funding

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| MISE | Mean integrated squared error |
| ISE | Integrated squared error |
| SSB | Smoothed stationary bootstrap |
| SMBB | Smoothed moving blocks bootstrap |
| iid | Independent and identically distributed |
| ${h}_{DO}$ | DO-validation bandwidth selector for hazard rate estimation (see [10]) |
| ${h}_{GCM}^{*}$ | González-Manteiga, Cao, Marron bandwidth selector for hazard rate estimation (see [11]) |
| ${h}_{PI}$ | Plug-in bandwidth selector for density estimation with dependent data (see [12]) |
| ${h}_{C{V}_{l}}$ | Leave-$(2l+1)$-out cross-validation for density estimation (see [13]) |
| ${h}_{SMCV}$ | Modified cross-validation for density estimation with dependent data (see [8]) |
| ${h}_{PCV}$ | Penalized cross-validation for density estimation with dependent data (see [8]) |
| ${h}_{CV}$ | Cross-validation bandwidth selector for hazard rate estimation (see [14]) |
| ${h}_{MISE}$ | Bandwidth that minimizes the theoretical MISE(h) |
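
For orientation, the two error criteria abbreviated above are usually defined as follows for a generic estimate ${\widehat{m}}_{h}$ of a target curve $m$ (the density or the hazard rate). The symbols ${\widehat{m}}_{h}$ and $m$ are notation assumed here, and the weight function that is sometimes included in the hazard rate setting is omitted:

$$ISE\left(h\right)=\int {\left({\widehat{m}}_{h}\left(x\right)-m\left(x\right)\right)}^{2}\,dx,\qquad MISE\left(h\right)=E\left[ISE\left(h\right)\right],\qquad {h}_{MISE}=\arg \min_{h>0}MISE\left(h\right)$$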

## Appendix A

**Smoothed stationary bootstrap**

- Draw ${X}_{1}^{*\left(SB\right)}$ from ${F}_{n}$, the empirical distribution function of the sample.
- Define ${X}_{1}^{*}={X}_{1}^{*\left(SB\right)}+g{U}_{1}^{*}$, where ${U}_{1}^{*}$ has been drawn with density K, independently of ${X}_{1}^{*\left(SB\right)}$.
- Assume we have already drawn ${X}_{1}^{*},\dots ,{X}_{i}^{*}$ (and, consequently, ${X}_{1}^{*\left(SB\right)},\dots ,{X}_{i}^{*\left(SB\right)}$) and consider the index j for which ${X}_{i}^{*\left(SB\right)}={X}_{j}$. We define a binary auxiliary random variable ${I}_{i+1}^{*}$ such that ${P}^{*}\left({I}_{i+1}^{*}=1\right)=1-p$ and ${P}^{*}\left({I}_{i+1}^{*}=0\right)=p$. We assign ${X}_{i+1}^{*\left(SB\right)}={X}_{\left(j \bmod n\right)+1}$ whenever ${I}_{i+1}^{*}=1$, and we draw ${X}_{i+1}^{*\left(SB\right)}$ from the empirical distribution function ${F}_{n}$ whenever ${I}_{i+1}^{*}=0$, where $\bmod$ stands for the modulus operator.
- Once ${X}_{i+1}^{*\left(SB\right)}$ has been drawn, we define ${X}_{i+1}^{*}={X}_{i+1}^{*\left(SB\right)}+g{U}_{i+1}^{*}$, where, again, ${U}_{i+1}^{*}$ has been drawn from the density K, independently of ${X}_{i+1}^{*\left(SB\right)}$.
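
The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `smoothed_stationary_bootstrap` is hypothetical, the kernel K is assumed to be the standard Gaussian density, and the resample is assumed to have the same length n as the original sample.

```python
import random


def smoothed_stationary_bootstrap(sample, p, g, rng=None):
    """Return one smoothed stationary bootstrap resample and the indices used.

    Assumptions: K is the standard Gaussian density and the resample
    has the same length as the original sample.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    rng = rng or random.Random()
    n = len(sample)
    j = rng.randrange(n)  # first value drawn from the empirical distribution F_n
    indices = [j]
    for _ in range(n - 1):
        if rng.random() < p:
            # I* = 0 (probability p): fresh draw from F_n
            j = rng.randrange(n)
        else:
            # I* = 1 (probability 1 - p): next observation, wrapping around
            j = (j + 1) % n
        indices.append(j)
    # Smoothing step: add g * U*, with U* drawn from K independently
    resample = [sample[i] + g * rng.gauss(0.0, 1.0) for i in indices]
    return resample, indices
```

With zero-based indexing, `(j + 1) % n` plays the role of ${X}_{\left(j \bmod n\right)+1}$. Setting `g = 0` gives an unsmoothed stationary bootstrap resample, and `p = 1` draws every value independently from ${F}_{n}$ before smoothing.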

**Smoothed moving blocks bootstrap**

- Fix the block length, $b\in \mathbb{N}$, and define $k=\min \left\{\ell \in \mathbb{N}:\ell \ge \frac{n}{b}\right\}$, the smallest integer not below $n/b$.
- Define, for $i=1,2,\dots ,q$, with $q=n-b+1$:$${B}_{i,b}=({X}_{i},{X}_{i+1},\dots ,{X}_{i+b-1})$$
- Draw ${\xi}_{1},{\xi}_{2},\dots ,{\xi}_{k}$ with uniform discrete distribution on $\{{B}_{1,b},{B}_{2,b},\dots ,{B}_{q,b}\}$.
- Define ${X}_{1}^{*\left(MBB\right)},\dots ,{X}_{n}^{*\left(MBB\right)}$ as the first n components of$$({\xi}_{1,1},{\xi}_{1,2},\dots ,{\xi}_{1,b},{\xi}_{2,1},{\xi}_{2,2},\dots ,{\xi}_{2,b},\dots ,{\xi}_{k,1},{\xi}_{k,2},\dots ,{\xi}_{k,b})$$
- Define ${X}_{i}^{*}={X}_{i}^{*\left(MBB\right)}+g{U}_{i}^{*}$, where ${U}_{i}^{*}$ has been drawn with density K, independently of ${X}_{i}^{*\left(MBB\right)}$, for all $i=1,2,\dots ,n$.
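
A possible sketch of this resampling plan is given below. It is an illustration under assumptions rather than the authors' code: the function name `smoothed_moving_blocks_bootstrap` is hypothetical and the kernel K is assumed to be the standard Gaussian density.

```python
import random


def smoothed_moving_blocks_bootstrap(sample, b, g, rng=None):
    """Return one smoothed moving blocks bootstrap resample and the block starts.

    Assumption: K is the standard Gaussian density.
    """
    n = len(sample)
    if not 1 <= b <= n:
        raise ValueError("block length b must satisfy 1 <= b <= n")
    rng = rng or random.Random()
    k = -(-n // b)  # smallest integer k with k >= n / b
    q = n - b + 1   # number of overlapping blocks of length b
    # Draw k block starting positions uniformly among the q available blocks
    starts = [rng.randrange(q) for _ in range(k)]
    # Concatenate the chosen blocks and keep the first n components
    mbb = [sample[s + offset] for s in starts for offset in range(b)][:n]
    # Smoothing step: add g * U*, with U* drawn from K independently
    resample = [x + g * rng.gauss(0.0, 1.0) for x in mbb]
    return resample, starts
```

For example, with `n = 10` and `b = 3` the function draws `k = 4` blocks among `q = 8` candidates, concatenates their 12 values, and keeps the first 10 before adding the smoothing noise.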

## References

1. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Chapman & Hall: London, UK, 1986.
2. Devroye, L. A Course in Density Estimation; Birkhäuser: Boston, MA, USA, 1987.
3. Watson, G.S.; Leadbetter, M.R. Hazard analysis I. Biometrika **1964a**, 51, 175–184.
4. Watson, G.S.; Leadbetter, M.R. Hazard analysis II. Sankhyā Ser. A **1964b**, 26, 101–116.
5. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. **1962**, 33, 1065–1076.
6. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. **1956**, 27, 832–837.
7. Barbeito, I.; Cao, R. Smoothed stationary bootstrap bandwidth selection for density estimation with dependent data. Comput. Stat. Data Anal. **2016**, 104, 130–147.
8. Barbeito, I.; Cao, R. A review and some new proposals for bandwidth selection in nonparametric density estimation for dependent data. In From Statistics to Mathematical Finance: Festschrift in Honour of Winfried Stute; Ferger, D., González Manteiga, W., Schmidt, T., Wang, J.L., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 173–208. ISBN 978-3-319-50986-0.
9. Barbeito, I.; Cao, R. Smoothed bootstrap bandwidth selection for nonparametric hazard rate estimation. Preprint **2018**.
10. Gámiz, M.L.; Mammen, E.; Martínez-Miranda, M.D.; Nielsen, J.P. Double one-sided cross-validation of local linear hazards. J. R. Stat. Soc. Ser. B Stat. **2016**, 78, 775–779.
11. González-Manteiga, W.; Cao, R.; Marron, J.S. Bootstrap Selection of the Smoothing Parameter in Nonparametric Hazard Rate Estimation. J. Am. Stat. Assoc. **1996**, 91, 1130–1140.
12. Hall, P.; Lahiri, S.N.; Truong, Y.K. On bandwidth choice for density estimation with dependent data. Ann. Stat. **1995**, 23, 2241–2263.
13. Hart, J.D.; Vieu, P. Data-driven bandwidth choice for density estimation based on dependent data. Ann. Stat. **1990**, 18, 873–890.
14. Patil, P.N. On the Least Squares Cross-Validation Bandwidth in Hazard Rate Estimation. Ann. Stat. **1993**, 21, 1792–1810.

**Figure 1.** Boxplot of $\log \left(MISE\left(\widehat{h}\right)/MISE\left({h}_{MISE}\right)\right)$, $n=100$, where $\widehat{h}={h}_{C{V}_{l}}$ (first box), ${h}_{SMCV}$ (second box), ${h}_{PCV}$ (third box), ${h}_{SSB}^{*}$ (fourth box), ${h}_{SMBB}^{*}$ (fifth box) and ${h}_{PI}$ (sixth box).

**Table 1.** Mean and median of $ISE\left(\widehat{h}\right)$, $n=100$, where $\widehat{h}={h}_{CV}$ (third column), ${h}_{DO}$ (fourth column), ${h}_{BOOT1}$ (fifth column), ${h}_{BOOT2}$ (sixth column) and ${h}_{GCM}^{*}$ (seventh column).

|  |  | CV | DO | BOOT1 | BOOT2 | GCM |
|---|---|---|---|---|---|---|
| Gumbel model | Mean | $0.1656$ | $0.01651$ | $0.02914$ | $0.02882$ | $0.03595$ |
|  | Median | $0.15527$ | $0.01037$ | $0.012844$ | $0.01282$ | $0.01739$ |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Barbeito, I.; Cao, R. Computationally Efficient Bootstrap Expressions for Bandwidth Selection in Nonparametric Curve Estimation. *Proceedings* **2018**, *2*, 1164.
https://doi.org/10.3390/proceedings2181164
