Kernel Two-Sample and Independence Tests for Nonstationary Random Processes

## Abstract

## 1. Introduction

## 2. Related Work

## 3. MMD and HSIC for Nonstationary Random Processes

#### 3.1. Notation and Assumptions

#### 3.2. MMD for Nonstationary Random Processes

#### 3.3. HSIC for Nonstationary Random Processes

#### 3.4. Maximising the Test Power

## 4. Experimental Results on Synthetic Data

#### 4.1. Homogeneity Tests with MMD

#### 4.1.1. Setup

- Mean shift: ${\mu}_{\mathbf{X}}\left(t\right)=t$ and ${\mu}_{\mathbf{Y}}\left(t\right)=t+{\delta}_{\mu}{t}^{3}$. The basis coefficients are sampled as ${{\xi}_{\mathbf{X}}}_{i,1},\,{{\xi}_{\mathbf{Y}}}_{i,1}\sim \mathcal{N}(0,10)$ and ${{\xi}_{\mathbf{X}}}_{i,2},\,{{\xi}_{\mathbf{Y}}}_{i,2}\sim \mathcal{N}(0,5)$, and the additive noises are sampled as ${{\epsilon}_{\mathbf{X}}}_{i,t},\,{{\epsilon}_{\mathbf{Y}}}_{i,t}\sim \mathcal{N}(0,0.25)$.
- Variance shift: We take ${\mu}_{\mathbf{X}}\left(t\right)={\mu}_{\mathbf{Y}}\left(t\right)=0$, and introduce a shift in variance in the first basis function coefficients via ${{\xi}_{\mathbf{X}}}_{i,1}\sim \mathcal{N}(0,10)$ and ${{\xi}_{\mathbf{Y}}}_{i,1}\sim \mathcal{N}(0,10+{\delta}_{\sigma})$. The second coefficients are sampled as ${{\xi}_{\mathbf{X}}}_{i,2},\,{{\xi}_{\mathbf{Y}}}_{i,2}\sim \mathcal{N}(0,5)$, and the noises as ${{\epsilon}_{\mathbf{X}}}_{i,t},\,{{\epsilon}_{\mathbf{Y}}}_{i,t}\sim \mathcal{N}(0,0.25)$.
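
The two homogeneity scenarios above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the curve model (a mean function plus two Fourier basis terms plus noise, on an equally spaced grid over $[0,1]$), the reading of $\mathcal{N}(0,v)$ as mean and variance, and the function names `sample_process`, `mean_shift_pair`, and `variance_shift_pair` are all hypothetical.

```python
import numpy as np


def sample_process(m, T, mean_fn, var_xi1=10.0, var_xi2=5.0, var_noise=0.25, rng=None):
    """Draw m noisy curves observed at T equally spaced points in [0, 1].

    ASSUMPTIONS (not stated in the text): the curve is
    mean(t) + xi1 * sqrt(2) sin(2 pi t) + xi2 * sqrt(2) cos(2 pi t) + noise,
    and N(0, v) is parameterised by its variance v.
    """
    rng = np.random.default_rng(rng)
    t = np.linspace(0.0, 1.0, T)
    # NumPy's normal() takes a standard deviation, hence the square roots.
    xi1 = rng.normal(0.0, np.sqrt(var_xi1), size=(m, 1))
    xi2 = rng.normal(0.0, np.sqrt(var_xi2), size=(m, 1))
    noise = rng.normal(0.0, np.sqrt(var_noise), size=(m, T))
    basis1 = np.sqrt(2.0) * np.sin(2.0 * np.pi * t)
    basis2 = np.sqrt(2.0) * np.cos(2.0 * np.pi * t)
    # Broadcasting: (T,) + (m, 1) * (T,) + (m, T) -> (m, T)
    return mean_fn(t) + xi1 * basis1 + xi2 * basis2 + noise


def mean_shift_pair(m, n, T, delta_mu, rng=None):
    """X has mean t; Y has mean t + delta_mu * t**3."""
    rng = np.random.default_rng(rng)
    X = sample_process(m, T, lambda t: t, rng=rng)
    Y = sample_process(n, T, lambda t: t + delta_mu * t**3, rng=rng)
    return X, Y


def variance_shift_pair(m, n, T, delta_sigma, rng=None):
    """Both means are zero; Y's first coefficient has variance 10 + delta_sigma."""
    rng = np.random.default_rng(rng)
    zero_mean = lambda t: np.zeros_like(t)
    X = sample_process(m, T, zero_mean, rng=rng)
    Y = sample_process(n, T, zero_mean, var_xi1=10.0 + delta_sigma, rng=rng)
    return X, Y
```

Setting `delta_mu=0` or `delta_sigma=0` yields two samples from the same distribution, which is the null case against which the rejection rate of a test can be checked.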

#### 4.1.2. Baseline Results without Test Power Optimisation

#### 4.1.3. Results of the Optimised Test

#### 4.2. Independence Tests with HSIC

#### 4.2.1. Setup

- Linear dependence: $\mathbf{X}$ is generated as in (9) with ${\mu}_{\mathbf{X}}\left(t\right)=t$, basis coefficients ${{\xi}_{\mathbf{X}}}_{i,1}\sim \mathcal{N}(0,10)$, ${{\xi}_{\mathbf{X}}}_{i,2}\sim \mathcal{N}(0,5)$, and noise ${{\epsilon}_{\mathbf{X}}}_{i,t}\sim \mathcal{N}(0,0.25)$. The samples of the second process are $\mathbf{Y}={\{{x}_{i,1}+{\epsilon}_{i}\}}_{i=1}^{m}$ where ${\epsilon}_{i}\sim \mathcal{N}(0,1)$, as in [18].
- Dependence through a shared coefficient: $\mathbf{X}$ and $\mathbf{Y}$ are generated as in (9) with ${\mu}_{\mathbf{X}}\left(t\right)={\mu}_{\mathbf{Y}}\left(t\right)=t$ and independently sampled ${{\xi}_{\mathbf{X}}}_{i,1}$, ${{\xi}_{\mathbf{Y}}}_{i,1}$, ${{\epsilon}_{\mathbf{X}}}_{i,t}$, ${{\epsilon}_{\mathbf{Y}}}_{i,t}$ as in the mean shift experiments of Section 4.1, but where the stochastic processes now share the second basis function coefficient: ${{\xi}_{\mathbf{X}}}_{i,2}={{\xi}_{\mathbf{Y}}}_{i,2}$.
- Dependence through rotation: We start by generating independent ${\mathbf{X}}^{\left(0\right)}$ and ${\mathbf{Y}}^{\left(0\right)}$ as in (9) with ${\mu}_{\mathbf{X}}\left(t\right)={\mu}_{\mathbf{Y}}\left(t\right)=t$ and ${{\epsilon}_{\mathbf{X}}}_{i,t},\,{{\epsilon}_{\mathbf{Y}}}_{i,t}\sim \mathcal{N}(0,0.25)$, but with ${{\xi}_{\mathbf{X}}}_{i,k}$ and ${{\xi}_{\mathbf{Y}}}_{i,k}$ drawn from: (i) Student-t, (ii) uniform, or (iii) exponential distributions [28] (Table 3). We next multiply $({\mathbf{X}}^{\left(0\right)},{\mathbf{Y}}^{\left(0\right)})$ by a $2\times 2$ rotation matrix $R\left(\theta \right)$ with $\theta \in [0,\pi /4]$ to generate new rotated samples $(\mathbf{X},\mathbf{Y})$, which we then test for independence. Clearly, for $\theta =0$ our samples $(\mathbf{X},\mathbf{Y})$ are independent, and as $\theta $ is increased their dependence becomes easier to detect (see [7] (Section 4) and Figure A3 for implementation details).
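
Two of the dependence mechanisms above lend themselves to a short NumPy sketch. It is illustrative only and rests on assumptions not fixed by the text: ${x}_{i,1}$ is read as the first time point of the $i$-th curve, the rotation is applied to matching entries of the two sample arrays, and the counterclockwise sign convention for $R(\theta)$ is a choice. The function names `linear_dependence` and `rotate_pair` are hypothetical, and the usage lines draw placeholder uniform samples rather than curves from model (9).

```python
import numpy as np


def linear_dependence(X, rng=None):
    """Return y_i = x_{i,1} + eps_i with eps_i ~ N(0, 1).

    ASSUMPTION: x_{i,1} is the first time point (column 0) of curve i.
    """
    rng = np.random.default_rng(rng)
    return X[:, 0] + rng.normal(0.0, 1.0, size=X.shape[0])


def rotate_pair(X0, Y0, theta):
    """Mix two equally shaped sample arrays with the 2x2 rotation R(theta).

    ASSUMPTION: each pair (X0[i, t], Y0[i, t]) is rotated counterclockwise,
    i.e. [X, Y]^T = [[cos, -sin], [sin, cos]] @ [X0, Y0]^T.
    """
    c, s = np.cos(theta), np.sin(theta)
    X = c * X0 - s * Y0
    Y = s * X0 + c * Y0
    return X, Y


# Usage with placeholder independent samples (not drawn from model (9)).
rng = np.random.default_rng(0)
X0 = rng.uniform(-10.0, 10.0, size=(200, 100))
Y0 = rng.uniform(-10.0, 10.0, size=(200, 100))
X, Y = rotate_pair(X0, Y0, np.pi / 8)  # theta = 0 would leave them independent
```

For non-Gaussian inputs, any $\theta \in (0,\pi/4]$ mixes the two independent sources, so the rotated pair is dependent even though a rotation of independent, equal-variance sources leaves them uncorrelated.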

#### 4.2.2. Baseline Results without Test Power Optimisation

#### 4.2.3. Results of the Optimised Test

## 5. Application to a Socioeconomic Dataset

## 6. Discussion and Conclusions

## Appendix A

#### Appendix A.1. Results for Realisations with Varying Number of Time Points, T

**Figure A1.** Results of the MMD-based homogeneity test with $T=[5,10,25,50,100]$: percentage of rejected ${H}_{0}$ for mean shift (**left**) and variance shift (**right**) for sample sizes $m=n=100$ and $T$ discrete time points in $d=1$ dimension.

**Figure A2.** Results of the HSIC-based independence test: test power for linear dependence (**left**) and dependence through a shared coefficient (**right**) as sample size is varied for various numbers of time points $T=[5,10,25,50,100]$.

#### Appendix A.2. Test Power Maximisation

**Table A1.** Linear search spaces for bandwidth $\sigma $ in MMD mean (**left**) and variance (**right**) shift experiments.

${\mathbf{\delta}}_{\mathbf{\mu}}$ | 0–2 | 2.25–3 | 3.25–5 | 5.5–8 | ${\mathbf{\delta}}_{\mathbf{\sigma}}$ | 0–4 | 5–14 | 15–32 |
---|---|---|---|---|---|---|---|---|
Step size | 0.25 | 0.25 | 0.25 | 0.5 | Step size | 1 | 1 | 1 |
Search space for $\sigma $ | 1 | 6 | 11 | 16 | Search space for $\sigma $ | 10 | 20 | 30 |
 | 3 | 8 | 13 | 18 | | 12 | 22 | 32 |
 | 5 | 10 | 15 | 20 | | 14 | 24 | 34 |
 | 7 | 12 | 17 | 22 | | 16 | 26 | 36 |
 | 9 | 14 | 19 | 24 | | 18 | 28 | 38 |
 | 11 | 16 | 21 | 26 | | 20 | 30 | 40 |
 | 13 | 18 | 23 | 28 | | 22 | 32 | 42 |
 | 15 | 20 | 25 | 30 | | 24 | 34 | 44 |
 | 17 | 22 | 27 | 32 | | 26 | 36 | 46 |
 | 19 | 24 | 29 | 34 | | 28 | 38 | 48 |
 | 21 | 26 | 31 | 36 | | 30 | 40 | 50 |
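
Each column of Table A1 is an evenly spaced grid of 11 bandwidths with spacing 2, so the search spaces can be generated from their starting value alone. The sketch below encodes that observation; the dictionary names and the helper `sigma_grid` are hypothetical, and how the grid is then used to select $\sigma $ is not shown here.

```python
import numpy as np

# Starting bandwidth for each shift range, read off Table A1.
MEAN_SHIFT_GRID_START = {(0.0, 2.0): 1, (2.25, 3.0): 6, (3.25, 5.0): 11, (5.5, 8.0): 16}
VAR_SHIFT_GRID_START = {(0, 4): 10, (5, 14): 20, (15, 32): 30}


def sigma_grid(delta, grid_start):
    """Return the 11 candidate bandwidths for a shift of size delta."""
    for (low, high), start in grid_start.items():
        if low <= delta <= high:
            # 11 values with spacing 2, e.g. 1, 3, ..., 21
            return start + 2 * np.arange(11)
    raise ValueError(f"delta={delta} is outside the tabulated ranges")


print(sigma_grid(2.5, MEAN_SHIFT_GRID_START))
print(sigma_grid(20, VAR_SHIFT_GRID_START))
```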

#### Appendix A.3. Distribution Specifications for Basis Function Coefficients in Rotation Mixing

**Table A2.** Specifications of distributions for the rotation mixing. They are a subset of the distributions in [28] (Table 3), and $\mathbf{Z}$ is a proxy for both $\mathbf{X}$ and $\mathbf{Y}$.

Distribution | Fourier basis function coefficient ${{\xi}_{\mathbf{Z}}}_{i,1}$ | Fourier basis function coefficient ${{\xi}_{\mathbf{Z}}}_{i,2}$ |
---|---|---|
Exponential | $\lambda =1.5$ | $\lambda =3$ |
Student-t | $\nu =3$ | $\nu =5$ |
Uniform | $\mathcal{U}[-10,10]$ | $\mathcal{U}[-5,5]$ |

**Figure A3.** Illustration of $\mathbf{X}$ and $\mathbf{Y}$ with (**i**) Student-t, (**ii**) uniform, and (**iii**) exponential basis function coefficients being mixed by different rotation angles $\theta $, ordered clockwise by increasing $\theta $.

#### Appendix A.4. SDG Dataset

**Figure 1.** Results of our MMD-based homogeneity test for nonstationary random processes: percentage of rejected ${H}_{0}$ as mean shift (**left**) and variance shift (**right**) are varied. Our baseline method (solid lines) is compared to [15] (dashed lines) for different sample sizes $m=n=100,200,300,500$ and $T=100$ discrete time points.

**Figure 2.** Results of the homogeneity test with test power optimisation: percentage of rejected ${H}_{0}$ for mean shift (**left**) and variance shift (**right**) for sample sizes $m=n=100,200$ and $T=100$ discrete time points. Our optimised test power method (dotted lines) is compared to our baseline method (solid lines) and [15] (dashed lines).

**Figure 3.** Results of the HSIC-based independence test: test power for linear dependence (**left**) and dependence through shared coefficients (**right**) as sample size is varied for various numbers of time points. For the linear dependence, we compare our baseline results to SubCorr and SubHSIC; for the shared coefficient, we compare against two spectral approximations [18] (Section 5.1).

**Figure 4.** Results of the HSIC-based independence test: percentage of rejected ${H}_{0}$ in rotation dependence experiments for different numbers of discrete time points $T$ and coefficients ${{\xi}_{\mathbf{X}}}_{i,k}$ and ${{\xi}_{\mathbf{Y}}}_{i,k}$ drawn from three distributions: (**i**) Student-t, (**ii**) uniform, and (**iii**) exponential (see Appendix A.3). The sample size is $m=200$. The violet dotted lines are the results of our test power maximisation.

