Computationally Efficient Bootstrap Expressions for Bandwidth Selection in Nonparametric Curve Estimation

## Abstract

## 1. Introduction

## 2. Nonparametric Density Estimation

## 3. Nonparametric Hazard Rate Estimation

## 4. Simulation Results

- **Density estimation:** An AR(1) model given by $X_{t}=-0.6X_{t-1}+0.8a_{t}$, where $a_{t}\stackrel{d}{=}N(0,1)$.
- **Hazard rate estimation:** A Gumbel model such that $f(x)=e^{-x}e^{-e^{-x}}$, $\forall x\ge 0$.
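
As a rough illustration of the density-estimation setting, the AR(1) model above can be simulated as follows. This is a minimal sketch, not the authors' simulation code: the function name `simulate_ar1`, its parameters, and the choice to start the chain from the stationary distribution are assumptions made here. With these coefficients the stationary variance is $0.8^{2}/(1-0.6^{2})=1$, so the marginal distribution of $X_{t}$ is $N(0,1)$.

```python
import random


def simulate_ar1(n, phi=-0.6, sigma=0.8, seed=None):
    """Simulate X_t = phi * X_{t-1} + sigma * a_t with a_t ~ N(0, 1).

    Hypothetical helper: the first value is drawn from the stationary
    distribution N(0, sigma^2 / (1 - phi^2)), so no burn-in is needed.
    """
    rng = random.Random(seed)
    stationary_sd = sigma / (1.0 - phi ** 2) ** 0.5
    x = rng.gauss(0.0, stationary_sd)
    path = [x]
    for _ in range(n - 1):
        # One step of the autoregression with a fresh Gaussian innovation.
        x = phi * x + sigma * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


if __name__ == "__main__":
    sample = simulate_ar1(100, seed=123)
    print(len(sample), min(sample), max(sample))
```

Starting from the stationary law is a convenience; discarding an initial burn-in segment would serve the same purpose.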

## 5. Discussion

## Funding

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| MISE | Mean integrated squared error |
| ISE | Integrated squared error |
| SSB | Smoothed stationary bootstrap |
| SMBB | Smoothed moving blocks bootstrap |
| iid | Independent and identically distributed |
| ${h}_{DO}$ | DO-validation bandwidth selector for hazard rate estimation (see [10]) |
| ${h}_{GCM}^{*}$ | González-Manteiga, Cao, Marron bandwidth selector for hazard rate estimation (see [11]) |
| ${h}_{PI}$ | Plug-in bandwidth selector for dependent data (see [12]) |
| ${h}_{C{V}_{l}}$ | Leave-$(2l+1)$-out cross-validation for density estimation (see [13]) |
| ${h}_{SMCV}$ | Modified cross-validation for density estimation with dependent data (see [8]) |
| ${h}_{PCV}$ | Penalized cross-validation for density estimation with dependent data (see [8]) |
| ${h}_{CV}$ | Cross-validation bandwidth selector for hazard rate estimation (see [14]) |
| ${h}_{MISE}$ | Bandwidth that minimizes the theoretical $MISE(h)$ |

## Appendix A

**Smoothed stationary bootstrap**

- Draw $X_{1}^{*(SB)}$ from $F_{n}$, the empirical distribution function of the sample.
- Define $X_{1}^{*}=X_{1}^{*(SB)}+gU_{1}^{*}$, where $U_{1}^{*}$ is drawn from the density $K$, independently of $X_{1}^{*(SB)}$.
- Assume that $X_{1}^{*},\dots ,X_{i}^{*}$ have already been drawn (and, consequently, $X_{1}^{*(SB)},\dots ,X_{i}^{*(SB)}$), and consider the index $j$ for which $X_{i}^{*(SB)}=X_{j}$. Define a binary auxiliary random variable $I_{i+1}^{*}$ such that $P^{*}\left(I_{i+1}^{*}=1\right)=1-p$ and $P^{*}\left(I_{i+1}^{*}=0\right)=p$. Set $X_{i+1}^{*(SB)}=X_{(j \bmod n)+1}$ whenever $I_{i+1}^{*}=1$, and draw $X_{i+1}^{*(SB)}$ from the empirical distribution function $F_{n}$ whenever $I_{i+1}^{*}=0$, where $\bmod$ denotes the modulus operator.
- Once $X_{i+1}^{*(SB)}$ has been drawn, define $X_{i+1}^{*}=X_{i+1}^{*(SB)}+gU_{i+1}^{*}$, where, again, $U_{i+1}^{*}$ is drawn from the density $K$, independently of $X_{i+1}^{*(SB)}$.
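
The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name and signature are hypothetical, the kernel $K$ is taken to be the standard Gaussian density, and the resample is assumed to have the same length $n$ as the original sample. Indices are 0-based, so the successor $X_{(j \bmod n)+1}$ becomes `(j + 1) % n`.

```python
import random


def smoothed_stationary_bootstrap(sample, p, g, seed=None):
    """Return (resample, indices) for one smoothed stationary bootstrap draw.

    Assumptions (not from the source): Gaussian kernel K, resample length n.
    `indices` records which original observation each X_i^{*(SB)} came from.
    """
    rng = random.Random(seed)
    n = len(sample)
    j = rng.randrange(n)  # X_1^{*(SB)} drawn from the empirical distribution
    indices = [j]
    for _ in range(n - 1):
        if rng.random() < 1.0 - p:
            # I* = 1 (probability 1 - p): move to the next observation, circularly.
            j = (j + 1) % n
        else:
            # I* = 0 (probability p): restart from the empirical distribution.
            j = rng.randrange(n)
        indices.append(j)
    # Smoothing step: add g * U*, with U* drawn from K independently of X^{*(SB)}.
    resample = [sample[j] + g * rng.gauss(0.0, 1.0) for j in indices]
    return resample, indices


if __name__ == "__main__":
    data = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4]
    boot, idx = smoothed_stationary_bootstrap(data, p=0.25, g=0.2, seed=42)
    print(idx)
    print(boot)
```

Setting `g=0` recovers an unsmoothed stationary bootstrap resample, which is a convenient way to check the index logic in isolation.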

**Smoothed moving blocks bootstrap**

- Fix the block length, $b\in \mathbb{N}$, and define $k=\min \left\{\ell \in \mathbb{N}:\ell \ge \frac{n}{b}\right\}$.
- Define the blocks $$B_{i,b}=(X_{i},X_{i+1},\dots ,X_{i+b-1}).$$
- Draw $\xi_{1},\xi_{2},\dots ,\xi_{k}$ from the discrete uniform distribution on $\{B_{1,b},B_{2,b},\dots ,B_{q,b}\}$, with $q=n-b+1$.
- Define $X_{1}^{*(MBB)},\dots ,X_{n}^{*(MBB)}$ as the first $n$ components of $$(\xi_{1,1},\xi_{1,2},\dots ,\xi_{1,b},\xi_{2,1},\xi_{2,2},\dots ,\xi_{2,b},\dots ,\xi_{k,1},\xi_{k,2},\dots ,\xi_{k,b}).$$
- Define $X_{i}^{*}=X_{i}^{*(MBB)}+gU_{i}^{*}$, where $U_{i}^{*}$ is drawn from the density $K$, independently of $X_{i}^{*(MBB)}$, for all $i=1,2,\dots ,n$.
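
A minimal Python sketch of the smoothed moving blocks bootstrap steps above is given next. It is not the authors' code: the function name and signature are hypothetical, and the kernel $K$ is assumed to be the standard Gaussian density. Block start positions are 0-based, so a start `s` corresponds to the block beginning at $X_{s+1}$.

```python
import random


def smoothed_moving_blocks_bootstrap(sample, b, g, seed=None):
    """Return (resample, starts) for one smoothed moving blocks bootstrap draw.

    Assumption (not from the source): Gaussian kernel K.
    `starts` holds the 0-based start position of each of the k drawn blocks.
    """
    n = len(sample)
    if not 1 <= b <= n:
        raise ValueError("block length b must satisfy 1 <= b <= n")
    rng = random.Random(seed)
    k = -(-n // b)   # smallest integer k with k >= n / b
    q = n - b + 1    # number of available overlapping blocks
    # Draw k blocks uniformly at random, with replacement.
    starts = [rng.randrange(q) for _ in range(k)]
    # Concatenate the blocks and keep only the first n components.
    mbb = [sample[s + t] for s in starts for t in range(b)][:n]
    # Smoothing step: add g * U*, with U* drawn from K independently.
    resample = [x + g * rng.gauss(0.0, 1.0) for x in mbb]
    return resample, starts


if __name__ == "__main__":
    data = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4, 0.9]
    boot, starts = smoothed_moving_blocks_bootstrap(data, b=3, g=0.2, seed=42)
    print(starts)
    print(boot)
```

Truncating the concatenated blocks to length $n$ handles the case where $b$ does not divide $n$, in which case the last block is only partially used.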

## References

**Figure 1.** Boxplot of $\log \left(MISE\left(\widehat{h}\right)/MISE\left({h}_{MISE}\right)\right)$ for $n=100$, where $\widehat{h}={h}_{C{V}_{l}}$ (first box), ${h}_{SMCV}$ (second box), ${h}_{PCV}$ (third box), ${h}_{SSB}^{*}$ (fourth box), ${h}_{SMBB}^{*}$ (fifth box) and ${h}_{PI}$ (sixth box).

**Table 1.** Mean and median of $ISE\left(\widehat{h}\right)$ for $n=100$, where $\widehat{h}={h}_{CV}$ (third column), ${h}_{DO}$ (fourth column), ${h}_{BOOT1}$ (fifth column), ${h}_{BOOT2}$ (sixth column) and ${h}_{GCM}^{*}$ (seventh column).

| | | CV | DO | BOOT1 | BOOT2 | GCM |
|---|---|---|---|---|---|---|
| Gumbel model | Mean | $0.1656$ | $0.01651$ | $0.02914$ | $0.02882$ | $0.03595$ |
| | Median | $0.15527$ | $0.01037$ | $0.012844$ | $0.01282$ | $0.01739$ |

