Volatility is log-normal-- but not for the reason you

It is impossible to discriminate between the commonly used stochastic volatility models of Heston, log-normal, and 3-over-2 on the basis of exponentially weighted averages of daily returns—even though it appears so at first sight. However, with a 5-min sampling frequency, the models can be differentiated and empirical evidence overwhelmingly favours a fast mean-reverting log-normal model.

. Estimated empirical density of the annualized measured instantaneous S&P500 variance (4552 daily observations 1996-2013 using Equation (1)) together with fitted log-normal and gamma densities.
Continuing that line of thought, how do we know the gamma density does not pass tests as well? The answer lies in realizing that a diffusion model makes stronger statements about the behaviour of volatility than just its stationary (unconditional, asymptotic) distribution. A diffusion model specifies conditional distributions at any time-horizon and for any initial value of volatility 2 . Fortunately, in the 1990s, a sound theory for statistical inference for diffusion processes was established, see Sørensen (2012) for a survey; we will draw on this theory. But we also face another problem. Equation (1) does not directly give the "true" continuous-time object that Heston and other stochastic volatility models describe. It gives a measured quantity, and as such it is contaminated by noise.
As the opening paragraph suggests, the focus of this paper is a statistical investigation of volatility models commonly used in finance. To start off, when modelling a positive quantity like variance, the log-normal distribution is a convenient choice. It is therefore not surprising that it was used for early stochastic volatility models. For a taxonomy see Table 1 in Poulsen et al. (2009). However, for option pricing the integrated variance T 0 V(u)du plays a central role and its distribution is unknown in closed form for log-normal models (intuitively because the log-normal distribution is not stable under addition). No closed-form expression is therefore available for option prices. This has diminished the academic appeal as revealed by Table 1, which shows the impact (measured by a citation count) of different stochastic volatility papers. In place, option-price approximations have been suggested. The two most popular treat the opposite ends of the parametric spectrum: Hagan et al. (2002) work with the so-called SABR model where volatility has no mean-reversion, while Fouque et al. (2000) develop approximations that work well if a strong mean-reversion is present. 2 A further complication is that for the models we consider, not all parameters are identifiable by their stationary distribution. Affine, basic Heston (1993) 7637 (310) Affine, fancy Duffie et al. (2000) 3009 (171) Log-normal, no mean-reversion κ Hagan et al. (2002) 974 (62) Log-normal, fast mean-reversion κ Fouque et al. (2000) 175 (10) 3-over-2 Carr and Sun (2007) 94 (9) All of the above Lewis (2000) 766 (43) We investigate what we can say about volatility with statistical certainty based on volatility dynamics of observed price processes, as opposed to evidence from option prices observed from option markets. The answers are in the title. More specifically, we look at three continuous-time models (Heston, and show that with daily measurements, Equation (1) cannot be used to discriminate between these models. Day-to-day changes are too noisy to represent the fine structure of the models, and an analysis like the one based on Figure 1 may not only be inconclusive, but outright misleading. For all models the log-normal density provides the best fit of the stationary distribution of measured variance. However, shifting to a 5-min observation frequency it is indeed possible to discriminate between the models as we show both by controlled simulation studies and with market data. We find overwhelming support for fast mean-reverting log-normal models, both for the S&P500 index and for individual stocks. We document that the log-normal model pass goodness-of-fit tests convincingly and is robust to jump corrections.
The rest of the paper is organized as follows: Section 2 reviews theory of models, inference, and measurement. Section 3 presents a comprehensive simulation study while Section 4 reports empirical results for US data. Finally, the epilogue of Section 5 discusses limitations of our analysis and where we might be heading from here.

Continuous-Time Stochastic Volatility Models
We model the price of a given financial asset S = (S(t)) t≥0 by a stochastic differential equation where Z is a Wiener process, µ is the drift (rate) coefficient, and the (stochastic) instantaneous variance process V = (V(t)) t≥0 is our main modelling target. Heston (1993) suggests a model where V follows a square-root process (also known as a Feller or a Cox-Ingersoll-Ross process), with mean reversion speed κ, long-term mean θ, diffusion ε, and a Wiener process W correlated to Z with coefficient ρ. This process stays strictly positive if Feller's condition is satisfied, i.e., 2κθ > 2 . Applying the Ito formula to an integrating factor Z(t) = e κt V(t) we get From this we immediately conclude that By taking conditional variance on Equation (4), using the Ito isometry and Equation (5) for the conditional mean, elementary integration gives It can further be shown 3 that the conditional distribution of V in Heston's model is a non-central chi-squared distribution with d f = 4κθ/ε 2 degrees of freedom, non-centrality parameter If the parameters satisfy the Feller condition then V has a stationary gamma distribution with shape parameter d f /2 and scale parameter ε 2 /(2κ) 4 . Arguably, our treatment of the Heston model in this paper does not fairly reflect the seminal nature of Heston's work. He demonstrated that by working with characteristic functions (or in Fourier space), closed-form expressions could be derived for option prices up to a (numerically evaluated) integral. This transform method has since been shown to work in more general so-called affine models, see Duffie et al. (2000). We get a log-normal model by letting V(t) = exp (X(t)), where X is a mean-reverting Gaussian process (also known as an Ornstein-Uhlenbeck, and Vasicek process) with mean reversion speed κ, long-term mean θ, diffusion parameter ε, and a Wiener process W correlated with Z with correlation coefficient ρ. Using an integrating factor and the fact that a deterministic integrand with respect to a Wiener process is normally distributed, we conclude that X is a Gaussian process (making V log-normal) with conditional distribution and a stationary distribution which is normal with mean θ and variance ε 2 /(2κ) (let h → ∞).
The dynamics of the variance in the 3-over-2 model is given by where κ, θ and ε are parameters. Notice that the speed of mean-reversion depends on V itself; the drift function is quadratic, not affine. Applying the Ito formula to the reciprocal variance process which implies that Y follows a square-root process with mean reversion speedκ = κθ and long-term meanθ = (κ + ε 2 )/(κθ). We will refer to the model specified by Equation (8) as the reciprocal 3-over-2 model with reciprocal parameters (κ,θ,ε). This implies that V will have a stationary reciprocal gamma distribution with shape parameter 5 2κθ/ε 2 and rate parameter 2κ/ε 2 .

3
A detailed derivation of this is in Cairns (2004, Appendix B.2). 4 Notice that stationary distribution depends on κ and ε only through ε 2 /κ. This was what we meant when we talked about identification problems in an earlier footnote. 5 The first moment of the inverse-gamma distribution is defined if the shape parameter is greater than one, a condition certified by the Feller condition. In this case E[V(t)] = 2κ/(2κθ − ε 2 ) = 2κθ/(2κ + ε 2 ).
The 3-over-2 model gives a rather "wild" behaviour of variance, which is good in some contexts. Much pricing can be done by transform methods, although from our experience almost everything requires a delicate numerical treatment. The comprehensive-and in our opinion underrated-book by Lewis (2000) treats option pricing in the 3-over-2 model. The paper by Carr and Sun (2007) renewed interest in the model, 6 which was first suggested by Heston (1997) and Platen (1997). In a recent paper by Grasselli (2017), it has also been fused with Heston's CIR process for option pricing purposes, resulting in what the author calls the 4-over-2 model.

Estimation of Discretely Observed Diffusion Processes
Suppose for now that we have observed the instantaneous variance process or some non-parameter-dependent transformation of it; generically X. The observations are taken at discrete time-points t 0 , . . . , t N with spacings h i = t i+1 − t i , typically one day or a few minutes for high-frequency data. We thus have an observed path x = (x 0 , . . . , x N ) of X.
For the log-normal model with parameters ψ = (κ, θ, ε), the role of X is played by ln(V), and we can write the log-likelihood as where φ denotes the normal density, here with (conditional) mean The maximum likelihood estimateψ = (κ,θ,ε) is the argument that maximizes the log-likelihood (9). It may be found in closed form if observations are equidistant (and by a numerical optimizer if they are not). Calculating the observed information matrix by numerical differentiation at estimated values gives an estimated standard error of the jth parameter as (I −1 o ) j,j where (A −1 ) i,j denotes element i, j of the inverse of a matrix A. For the Heston model, the role of X is played by V itself. Optimization on the non-central chi-squared distribution that enters the Heston model's log-likelihood is not possible in closed form, and numerically it is delicate and slow. A simple way to obtain a consistent estimator of ψ = (κ, θ, ε) is by plugging the conditional moments into a Gaussian likelihood and maximizing. In other words, this means using the approximate log-likelihood where the conditional mean and variance are given by Equations (5) and (6). The reason this gives a consistent estimator is that the first order conditions for optimization of (10) are a set of so-called martingale estimating equations, see Sørensen (1999). We approximate standard errors of the estimated parameters from the observed information matrix of the approximated log-likelihood function by numerical differentiation. In a similar fashion for the (reciprocal) 3-over-2 model, the approximate log-likelihood (10) with 1/V in place of X, can be optimized to obtain maximum likelihood estimates of the reciprocal parameters (κ,θ,ε). These estimates may then be transformed to estimates of the original parameters 6 Older readers may remember that Chan et al. (1992), a very influential paper in the 90'ies, estimated the volatility of the short-term interest rate, r to be of the form 1.18r 1.48 , i.e., almost exactly 3/2, but their model had an affine drift, which (7) doesn't. through κ =κθ −ε 2 and θ =κ/(κθ −ε 2 ) while ε =ε. As well, standard errors of the reciprocal parameters may be calculated from the observed information matrix and standard errors of the original parameters by the delta method.

Goodness of Fit and Uniform Residuals
A goodness-of-fit analysis that takes into consideration the full conditional structure of a diffusion process X (and not just its stationary distribution) uses so-called uniform residuals, see Pedersen (1994).
The uniform residuals U 1 , . . . , U N will be independent and identically U(0, 1) distributed if the true distribution of X is the family {F i+1|i , i = 0, . . . , N − 1}. The proof of this result, that goes back to Rosenblatt (1952), is straightforward: If the random variable X has a strictly increasing, continuous distribution function F, then F(X) is U(0, 1). This is combined with the Markov property and iterated expectations are used to get the independence.
For the Heston model (and the reciprocal 3-over-2 model) we use the true conditional non-central chi-squared distribution to calculate uniform residuals-not the Gaussian density used for the parameter estimation.

Measuring Volatility
Contrary to the simplifying assumption in the previous subsection, the instantaneous variance process V is not directly observable. In practice we must measure it from observations of the asset price process S. One way to do that is by Equation (1). Another way, and one that enables us to make quantitative statements about the accuracy of the measurement, comes by following Andersen and Benzoni (2009) in a study of so-called realized volatility. To this end, denote the logarithmic asset price Y(t) = ln S(t) and the continuously compounded return over a measurement interval [t − k, t] by r(t, k) = Y(t) − Y(t − k). Applying the Ito formula to Equation (2) and putting α(t) = µ − V(t) 2 gives with quadratic variation For a partition of [t − k, t] of the form {t − k + j n , j = 1, . . . , n}, the realized variance is defined by and general semimartingale theory ensures convergence in probability 7 For high-frequency returns (a large n) RV gives a good approximation of QV and we may use to obtain a measured variance path. Thus, to adopt our 7 The convergence holds for any partition whose mesh size goes to 0.
notation, for an asset log-price Y observed at time points t 0 , . . . t N and a measurement interval [t i−n , t i ] (say one day), i.e., an observation frequency set by n (say 5 minutes), calculate . . . Note that we obtain daily variance measurements while we require asset price observations of a much higher frequency. Of course, we may reduce n to get more frequent variance measurements but for a cost of poorer approximation with respect to the convergence in (12).
We have described a simple-to-implement, model independent, and theoretically sound measure for realized and instantaneous variance. However, it is far from the only method. For instance, Reno (2008) describes the use of so-called δ-sequences and kernels for volatility measurement while Andreasen (2016, Equation (2), p. 29) suggests an estimator based on absolute, not squared, returns. A different, global time-series approach is offered by the numerous variants of GARCH models, see Engle et al. (2012). We refrain from analyzing these techniques further. Partly in the interest of space, partly because specific estimation choices might stack the deck for or against one of our three models.
From a theoretical point of view, Equation (12) tells us that the higher our sampling frequency is, the more accurately we measure quadratic variation by summing squared increments. However, high-frequency data are not just a blessing, they come with perils in the form of market microstructure effects. To illustrate, suppose the observed log-price is contaminated by (white) noise, ln S obs t = ln S(t) + (t), where all (t)'s are independent and, say, N(0, ξ). In this case an expression like (13) would contain a term of the form ) 2 , that clearly diverges as n → ∞. This means that if we use Equation (13) naively, there is an optimal observation frequency. Where that optimal frequency lies depends on the shape and magnitude of the market microstructure noise (which varies from market to market). Consensus in the literature says that it is less than 5 minutes (which is generally considered free from noise) but more than 1 second. However, in the case of noise and high frequencies we can do better than Equation (13). By a careful treatment of double asymptotics, Christensen et al. (2014) prove that with clever local smoothing (so-called pre-averaging) it is possible to remove market microstructure effects (like the white noise above, but also bid-ask bounce and outright outliers) without compromising too much of the measurement quality of volatility and of jumps; the latter of which we will consider below.

Realized Volatility for an Asset Price with Jumps
The asset price process in Equation (2) has continuous paths, arguably a restrictive constraint for market data of asset prices 8 . To this end, Andersen and Benzoni (2009) suggest adding a compound Poisson-type, finite activity jump-part to the price dynamics, where N = (N(t)) t≥0 is a Poisson process uncorrelated to Z, while ξ = (ξ(t)) t≥0 determines the magnitude of a jump ∆S(t) = S(t) − S(t − ) = S(t − )e ξ(t) that occurs at time t. The Ito formula for diffusions with jumps applied to Y(t) = log S(t) gives The jump term ξ(t)dN(t) is non-zero only if a jump occurs at time t and it should be understood in integral form where T 1 , . . . , T N(t) are the random jump times of N in [0, t]. The quadratic variation of Y in (14) includes both the integrated variance and a jump part while the convergence RV(t, k; n) −→ QV(t, k) as n → ∞.
still holds. As an estimator for the integrated variance part, we employ the realized bipower variation introduced by Barndorff-Nielsen and Shephard (2004) which provides a consistent estimate of the variance which is robust to jumps 9 . The ratio (RV(t, k; n) − BV(t, k; n))/RV(t, k; n) gives a measure of how much of the cumulative variance is due to jumps.

Simulation: Model Discrimination
To asses the statistical properties of our model discrimination test, we look at simulated asset price and variance data.
The first port of call when simulating solutions to stochastic differential equations is the Euler scheme, see Kloeden and Platen (1992). It provides a simple method for the generation of time-discrete approximations to sample paths. However, we quickly run into problems with the Euler scheme for the models we consider as it may produce negative variances. Therefore, we require methods that better preserve distributional properties, positivity specifically. For the log-normal model the solution is straightforward: simulate the log of the process; this is easy to do in an exact manner. Alas, the log-trick does not work for the Heston model; apply the Ito formula to see why. Therefore, we use the simulation scheme from Broadie and Kaya (2006) with a modification that approximates the integrated variance with a trapezoidal rule, see Platen and Bruti-Liberati (2010). This scheme is (almost) exact, but slow. Faster methods have been suggested, see Andersen et al. (2010). These generally preserve positivity but not exact distributional characteristics. In a similar fashion, Baldeaux (2012) provides an exact simulation method for the 3-over-2 model.

S&P 500 Data Revisited
To return to the question of credibility in the analysis of the introduction, we perform a mimicking simulation experiment. We apply the exponentially weighted moving average of Equation (1) to 4532 9 The presence of the π/2-factor indicates that this is not as obvious as it might seem. daily prices simulated from either Heston or the log-normal model with parameters given by the S & P 500 estimates in Section 4. The result for measured variance from Heston's model shows that fitting unconditional distributions leads to the the same conclusion as with the S&P 500 data: the data appears to be log-normal-see the left-hand graph in Figure 2. 10 To overcome the correlation problem and to fully exploit conditional information, we run a discriminatory test between the three models based on uniform residuals when variance is measured by Equation (1). The results are listed in the right-most column of Table 2, where we report the p-values from a chi-squared test under the null-hypothesis that residuals are uniform.For every simulation, the true underlying model is rejected with high significance-as are the wrong models. Variance measured by Equation (1) is simply too noisy to capture the fine structure properties implied by continuous-time stochastic volatility models. Next, we simulate returns with a 5-min frequency with n = 192, 11 i.e., a two-day aggregation, for the measurement of instantaneous variance. We choose to use of a 5-min frequency for several reasons (or, if you like, for a compromise): (i) it is common in the literature; (ii) data is available for a large range of markets from plain vanilla (although not open source) academic databases; (iii) our experiments show that model separation is possible; and (iv) we avoid contamination from market microstructure noise 12 . We then apply Equation (13) and run the uniform residuals goodness-of-fit test on this data. This gives the results reported in the second-to-right column of Table 2. Here we achieve model discrimination. In all three cases (across models and simulations), false models are rejected while the true model is accepted. We also apply the goodness-of-fit analysis on the simulated variance itself. In this way we investigate if there is any gain in going beyond a 5-min. sampling frequency. The results in the second-to-left column in Table 2 show that there isn't. Finally, we run 1000 repetitions for each of the above test cases and record p-values across all test alternatives. The results are consistent with Table 2: for close to all repetitions (∼99-100%) we achieve model discrimination based on the 5-min realized volatility, while the exponentially moving average rejects all models, in all cases.

Empirics: Horseraces between Models
The empirical analysis is based on data from the S&P 500 index (or, to be precise, an exchange traded fund (ETF) that tracks the S&P 500 index) and ten arbitrarily chosen securities listed on the New York Stock Exchange (NYSE). Their ticker symbols and corresponding number of observations are given in Table 4. All data is retrieved from the TAQ database via Wharton Research Data Services. The raw data consists of intraday tick-by-tick trade data. This data is, however, not directly suited for analysis and we apply a series of preprocessing methods. Specifically, we use the step-by-step cleaning procedure proposed by Barndorff-Nielsen et al. (2009). First, entries with zero prices are deleted, as are entries with an abnormal sale condition indicated from the code of the trade. The data is then restricted to include trades only from the exchange where the security is listed. Entries with the same time stamp are replaced by the median price in the last step.
The cleaned data will be on an irregular tick-by-tick time scale. We aggregates prices to an equidistant 5-min time grid by taking the last realized price before each grid point. Finally, the data is restricted to exchange trading hours 9:30 a.m. to 4:00 p.m. from Monday to Friday.
The realized volatility measure is applied with a resolution of n = 192 prices per variance measurement for the first step of the test methodology and a plot of the resulting variance path of 1920 observations is shown in the right-hand graph of Figure 3. The measured variance is used in the second step for the maximum (approximate) likelihood estimation where a numerical optimization is performed on the likelihood function for each model. Resulting estimated parameters and standard errors are reported in Table 3. Note that even though each of the parameters has the same qualitative interpretation across models (θ ∼ mean level, κ ∼ speed of mean-reversion, ε ∼ volatility of volatility), their numerical values cannot be compared directly. Table 3. S&P 500: Estimated parameters and standard errors from (approximate) maximum likelihood estimation of the Heston, log-normal, and 3-over-2 models.
As a sanity check, let us take a look at typical levels of instantaneous volatility that are implied by estimated parameters. For the Heston model it is √ 0.022 = 0.148 and for the log-normal model it is e −3.65/2 = 0.161-both in line with what we would anticipate. However, we have to be careful here as non-linear functions and parameters are in a range where Jensen's inequality matters. For instance, in the log-normal model, the stationary distribution for √ V = e X/2 has a mean e (−3.65+11.5 2 /(2 * 56.1))/2 = 0.291. The 3-over-2 parameter estimates look strange at first sight, but we can think of them this way. From Equation (8) the reciprocal parameters are ( κ, θ, ε) = (81, 119, 98.2). From the right-hand panel of Figure 3 we see that 119 is a typical value for reciprocal variance, and that this process covers a large range of values leading to combined high volatility and high speed of mean-reversion. The morale is that a non-linear and ill-fitting (see the right-most panel in Figure 4) model is hard to interpret.
To get a quantitative feeling for the speed of mean-reversion parameters, we use that for a process with linear mean-reversion, the time it takes to halve the deviation in expectation between current level and long-term level is X(t)e −κt 1/2 + θ(1 − e −κt 1/2 ) = (θ + X(t))/2 ⇒ t 1/2 = ln(2)/κ. The estimates imply that for Heston model the half-life of deviances is about eight days, for the log-normal model the half-life (for log-variance) is just over three days. This is strong mean-reversion. Or differently put, mean-reversion is in effect over short time-scales. For interest rate models, typical κ-values are 0.1-0.25 which implies deviance half-life of 3-7 years. Such strong mean-reversion plays havoc with our usual "quantitative intuitive interpretation", where it determines the standard deviation of the change (in absolute or relative terms) of the process over one year, i.e., a change of ± the volatility is "nothing out of the ordinary". That interpretation is non-sensical here. If applied it would tell us that the instantaneous variance in Heston is typically in the range 0.022 ± 0.44, and that volatilities of 5000% and 0.05% are imminently plausible in the log-normal model. It is the combination of high speed of mean-reversion and high volatility of volatility that is needed to generate the decidedly spiky behaviour of volatility which is what our likelihood approach aims at capturing, For the Heston model we have 2κθ = 0.95 <ε 2 = 8.9, so the Feller condition is violated by a factor 10. However, what we think is the most striking result in Table 3 is the last column, which contains goodness-of-fit tests for the different models: p-values for Kolmogorov-Smirnov on the uniform residuals. The log-normal model provides a (surprisingly) good fit (p = 0.68), Heston and 3-over-2 do not (p < 10 −12 ); the quantile plots in Figure 4 gives further visual evidence.
Comparing our findings to the literature, our most clear ally in the recent empirical literature is Christoffersen et al. (2010) who also find strong support for log-normal('ish) volatility models compared to Heston and 3-over-2. They only use daily data, so their parameter values (see Table 1 in Christoffersen et al. (2010)) do not pick up the fast mean-reverting feature. In that respect, our results are most in line with Fouque et al. (2000) who use a log-normal model (without explicit discriminatory goodness-of-fit test) and document strong mean-reversion (even stronger than ours; κ ∼ 200-250-although other work by the same authors suggests κ = 50). Similarly to us Andersen et al. (2001) (Figures 1-3 for instance) report log-normality of variance measured on 5-min sampling. They do not find fast mean-reversion, but rather that volatility has long-term memory.

Individual Stocks
A sceptic might say: "The log-normality of the index is an aggregation effect: some stocks are Heston, some are 3-over-2, and when you mix them all up, it comes out as log-normal". However, it is not, as results summarised in Table 5 show. Heston and the 3-over-2 models are rejected with all p-values indistinguishable from 0, the log-normal model is accepted for all 10 stocks (although 2 are borderline), mean-reversion is strong (all κ's > 50), and volatility of volatility is correspondingly high in order to generate spikes.

Jump Corrections
The sceptic might continue: "When you use Equation (13) you pick up both jumps and diffusive volatility. It think it's the jumps that cause log-normality. For the diffusive part Heston will do". (Conveniently, options can be priced using transform methods from Duffie et al. (2000).) Table 6 shows that log-normality of variance prevails when it is measured through bipower variation.

Conclusions and Future Work
In this paper we have demonstrated that log-normal models provide a significantly better description of the empirical behaviour of instantaneous variance than the Heston and the 3-over-2 models. However, at coarser time-resolutions it can be hard to discriminate between the models. Our intention has been to give a reasonably self-contained treatment of the theory and used models and methods, rather than to facilitate a replication of the results 13 .
We find that to replicate the spiky but stationary behaviour of volatility, we need a combination of strong mean-reversion and high volatility of volatility. From a modelling perspective this suggests that we look beyond the diffusive model, towards models with higher degree of path roughness (as measured by degree of Hölder continuity) of volatility. A promising line of research are the rough volatility models suggested in Gatheral et al. (2018) (see also Andreasen (2017)) where volatility is driven by a fractional Brownian motion.
Finally, it must be mentioned that the strongly mean-reverting volatility diffusion models that we investigate go against an often-documented (see the discussion in Bennedsen et al. (2017)) empirical property of volatility: long memory (persistence) at low frequencies. A natural solution (LeBaron (2001), Fouque et al. (2003)) in the realm of diffusions is to model instantaneous volatility as a sum of component process with vastly differing speeds of mean-reversion. It is less obvious how to do this in a fractional setting (where path roughness and persistence are exact opposites), but Bennedsen et al. (2017) suggest the use of semistationary processes. To us, the big question in this context is: What matters where? For which problems is the high frequency important, for which is the low frequency important? Are there problems with mixed effects? Our hunch is high frequency for option pricing and low frequency for portfolio choice, but we are yet to see quantitative evidence for this-although Bollerslev et al. (2018) is a promising framework for investigations.
Author Contributions: Martin Tegnér and Rolf Poulsen conceived and designed the experiments and wrote the paper; Martin Tegnér did the programming and performed the computational and end empirical experiments.