# Quantifying Emergent Behavior of Autonomous Robots


## Abstract


## 1. Introduction

## 2. Methods

#### 2.1. Entropy, Dimension and Excess Entropy

**Conditional entropy and mutual information** Let us consider two discrete-valued random variables X and Y with values $x\in \mathcal{X}$ and $y\in \mathcal{Y}$, respectively. The uncertainty of a measurement of X is quantified by the entropy $H\left(X\right)$. We may now ask: what is the average remaining uncertainty about X once Y has already been observed? This is quantified by the conditional entropy
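These quantities can be computed directly from empirical frequencies for discrete data. The following is a minimal sketch (function names are ours, not from the paper's code) using the identities $H(X|Y)=H(X,Y)-H(Y)$ and $I(X;Y)=H(X)-H(X|Y)$:

```python
import numpy as np
from collections import Counter

def entropy(values):
    """Shannon entropy H in nats from a sequence of discrete outcomes."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def conditional_entropy(xs, ys):
    """H(X|Y) = H(X,Y) - H(Y): remaining uncertainty about X given Y."""
    return entropy(list(zip(xs, ys))) - entropy(ys)

def mutual_information(xs, ys):
    """I(X;Y) = H(X) - H(X|Y)."""
    return entropy(xs) - conditional_entropy(xs, ys)

# Y is a noisy copy of X: observing Y removes part of the uncertainty about X
rng = np.random.default_rng(0)
xs = rng.integers(0, 4, size=100000)
ys = np.where(rng.random(100000) < 0.9, xs, rng.integers(0, 4, size=100000))
print(conditional_entropy(xs, ys))  # less than H(X) = log 4 ≈ 1.386 nats
print(mutual_information(xs, ys))
```

For a perfect copy Y = X the conditional entropy would vanish and the mutual information would equal $H(X)$.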

**Predictive information, excess entropy and entropy rate** The unpredictability of a time series can be characterized by the conditional entropy of the next state given the previous states. In the following we will use an abbreviated notation for these conditional entropies and the entropies involved:
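For a symbolic time series, the conditional block entropies $h_m = H_{m+1} - H_m$ can be estimated from block frequencies. A minimal sketch (illustrative helper names, not the paper's implementation):

```python
import numpy as np
from collections import Counter

def block_entropy(sym, m):
    """H_m: Shannon entropy (nats) of length-m blocks of a symbol sequence."""
    blocks = [tuple(sym[i:i + m]) for i in range(len(sym) - m + 1)]
    counts = np.array(list(Counter(blocks).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def conditional_block_entropy(sym, m):
    """h_m = H_{m+1} - H_m: uncertainty of the next symbol given m predecessors."""
    return block_entropy(sym, m + 1) - block_entropy(sym, m)

# For an i.i.d. fair binary sequence, h_m stays near log 2 for every m;
# temporal correlations would make h_m decrease with m toward the entropy rate.
rng = np.random.default_rng(1)
sym = rng.integers(0, 2, size=200000)
print([round(conditional_block_entropy(sym, m), 3) for m in range(4)])
```

The rate at which $h_m$ decreases with m is exactly what the excess entropy accumulates.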

#### 2.2. Decomposing the Excess Entropy for Continuous States

- by increasing dimension D of the dynamics
- by decreasing the noise level ${\epsilon}^{*}$
- by increasing the amplitude ${\epsilon}_{D}$
- by increasing the state complexity
- by increasing the correlations measured by the “memory” complexity, i.e., by increasing the predictability
- by decreasing the entropy rate ${h}_{KS}$, i.e., by decreasing the unpredictability

#### 2.3. Methods for Estimating the Information Theoretic Measures

**Estimation via local densities from nearest neighbor statistics (KSG)** The most common approach to estimating information quantities of continuous processes, such as the mutual information, is to calculate the differential entropies (1) directly from the nearest neighbor statistics. The key idea is to use nearest neighbor distances [33,34,35] as proxies for the local probability density. This method corresponds, in a way, to an adaptive bin size for each data point. For the mutual information $I(X,Y)$ (required e.g., to calculate the PI (Equation (18))), however, it is not recommended to calculate it naively from the individual entropies of X, Y and their joint $(X,Y)$, because these spaces may have very dissimilar scales, such that the adaptive binning leads to spurious results. For that reason a new method was proposed by Kraskov et al. [26], which we call KSG, that uses the nearest neighbor statistics only in the joint space. We denote by ${I}_{\mathrm{KSG}}^{\left(k\right)}(X,Y)$ the mutual information estimate where k nearest neighbors were used for the local estimation.
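A compact sketch of the KSG estimator (algorithm 1 of Kraskov et al.) illustrates the idea: the k-th neighbor distance is found in the joint space only, and the marginal spaces are then queried with that per-point radius. This is our own minimal implementation, not the code used in the paper:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mutual_information(x, y, k=4):
    """KSG estimate of I(X;Y) in nats: psi(k) + psi(N) - <psi(nx+1) + psi(ny+1)>,
    where nx, ny count marginal neighbors strictly within the joint k-NN distance."""
    x = np.asarray(x).reshape(len(x), -1)
    y = np.asarray(y).reshape(len(y), -1)
    n = len(x)
    joint = np.hstack([x, y])
    # distance to the k-th neighbor in the joint space (max-norm)
    d, _ = cKDTree(joint).query(joint, k + 1, p=np.inf)
    eps = d[:, -1]
    # count neighbors strictly closer than eps in each marginal space
    nx = cKDTree(x).query_ball_point(x, eps - 1e-12, p=np.inf, return_length=True) - 1
    ny = cKDTree(y).query_ball_point(y, eps - 1e-12, p=np.inf, return_length=True) - 1
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

# correlated Gaussians: the true value is I = -0.5 * log(1 - rho^2)
rng = np.random.default_rng(2)
x = rng.normal(size=5000)
y = 0.8 * x + 0.6 * rng.normal(size=5000)
print(ksg_mutual_information(x, y))  # close to -0.5*log(1 - 0.64) ≈ 0.511
```

The max-norm makes the joint neighborhood a box whose side is directly comparable across the marginal spaces, which is what avoids the dissimilar-scale problem described above.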

**Estimation via correlation sum** The correlation sum is one of the standard tools in nonlinear time series analysis [36,37]. Normally it is used to estimate the attractor dimension. However, it can also be used to provide approximate estimates of entropies and derived quantities such as the excess entropy. The correlation entropies for a random variable $\overrightarrow{X}$ with measure $\mu \left(\overrightarrow{x}\right)$ are defined as [25]
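A basic sketch of the correlation sum: count the fraction of point pairs closer than ε, and read off the correlation dimension $D^{(2)}$ as the log-log slope (illustrative code, assuming a max-norm metric as in the KSG sketch):

```python
import numpy as np
from scipy.spatial import cKDTree

def correlation_sum(points, eps):
    """C(eps): fraction of distinct point pairs with max-norm distance <= eps."""
    tree = cKDTree(points)
    n = len(points)
    pairs = tree.count_neighbors(tree, eps, p=np.inf) - n  # remove self-pairs
    return pairs / (n * (n - 1))

# Points uniform on a line segment embedded in 2D: C(eps) ~ eps^1,
# so the log-log slope recovers the correlation dimension D2 ≈ 1.
rng = np.random.default_rng(3)
t = rng.random(4000)
pts = np.column_stack([t, 0.5 * t])
scales = np.array([0.01, 0.02, 0.04, 0.08])
c = np.array([correlation_sum(pts, e) for e in scales])
slope = np.polyfit(np.log(scales), np.log(c), 1)[0]
print(slope)  # ≈ 1
```

The same scale-dependence of $\log C(\epsilon)$ underlies the entropy estimates used below.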

#### 2.4. Algorithm for Excess Entropy Decomposition

**Preprocessing:** Ideally the $\delta h$ curves are composed of straight lines in a log-linear representation, i.e., of the form $o - s\log\left(\epsilon \right)$. We will refer to s as the slope (it is actually the sign-inverted slope). We therefore perform a fitting procedure that attempts to find segments following this form; details are provided in Appendix B. The data is then substituted by the fits in the intervals where the fits are valid. Since the $\delta h$ become very noisy at very small scales, we extrapolate below the fit with the smallest scale. In addition we calculate the derivative $\widehat{s}(m,\epsilon )=\frac{d\,\delta {h}_{m}}{d\,\epsilon }$ in each point, either from the fits (s, where available) or from finite differences of the data (using 5-point averaging).
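The segment fit of the form $o - s\log(\epsilon)$ is a linear least-squares problem. A minimal sketch of this step (our own illustration of the fitting form, not the Appendix B algorithm, which additionally searches for valid segments):

```python
import numpy as np

def fit_log_linear(eps, dh):
    """Fit dh(eps) ≈ o - s*log(eps) over one candidate scaling range;
    returns (o, s, rms residual). s is the (sign-inverted) slope."""
    A = np.column_stack([np.ones_like(eps), -np.log(eps)])
    (o, s), *_ = np.linalg.lstsq(A, dh, rcond=None)
    resid = np.sqrt(np.mean((A @ np.array([o, s]) - dh) ** 2))
    return o, s, resid

# synthetic segment with known offset o = 0.3 and slope s = 0.8
eps = np.logspace(-2, -0.5, 20)
dh = 0.3 - 0.8 * np.log(eps)
o, s, r = fit_log_linear(eps, dh)
print(round(o, 3), round(s, 3))  # → 0.3 0.8
```

The residual is what a segment-search can threshold on to decide where a fit is valid.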

**Determining MT:** In theory only two $\delta {h}_{m}$ should have a non-zero slope at each scale ε, see Equation (21). In practice, however, we often have more terms, such that we need to find for each ε the maximal range $({m}_{l},{m}_{u})$ with $\forall i\in [{m}_{l},{m}_{u}]:\widehat{s}(i,\epsilon )>{s}_{\mathrm{min}}$, i.e., where the slope is larger than the threshold ${s}_{\mathrm{min}}$. However, this is only valid for deterministic scaling ranges; in stochastic ranges all $\delta h$ should have zero slope. We introduce a measure of stochasticity, defined as $\kappa (m,\epsilon )=1-{\sum }_{k={m}_{l}}^{{m}_{u}}\widehat{s}(k,\epsilon )$, which is 0 for purely deterministic ranges and 1 for stochastic ones. The separation between state and memory complexity is then inherited from the next larger deterministic range. Thus if $\kappa (m,\epsilon )\ge {\kappa }_{\mathrm{max}}$ we use $({m}_{l},{m}_{u})$ at ${\epsilon }^{*}$, where ${\epsilon }^{*}={arg\; min}_{e\in (\epsilon ,\infty )}\kappa (m,e)<{\kappa }_{\mathrm{max}}$. Note that the ${\epsilon }^{*}$ defined algorithmically here is not necessarily equal to the ${\epsilon }^{*}$ defined above Equation (28) for an ideal-typical noisy deterministic system.
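The range search and the stochasticity measure can be sketched as follows (our own minimal illustration, assuming the slope estimates $\widehat{s}(m,\epsilon)$ at one fixed scale are already available as an array indexed by m):

```python
import numpy as np

def deterministic_range(s_hat, s_min=0.1):
    """Maximal contiguous index range (m_l, m_u) where the slope estimate
    exceeds s_min; returns (None, None) if no term qualifies."""
    above = np.flatnonzero(s_hat > s_min)
    if len(above) == 0:
        return None, None
    # split into contiguous runs and keep the longest one
    runs = np.split(above, np.flatnonzero(np.diff(above) > 1) + 1)
    best = max(runs, key=len)
    return int(best[0]), int(best[-1])

def stochasticity(s_hat, m_l, m_u):
    """kappa = 1 - sum of slopes over the range: 0 deterministic, 1 stochastic."""
    return 1.0 - float(np.sum(s_hat[m_l:m_u + 1]))

s_hat = np.array([0.0, 0.05, 0.6, 0.4, 0.02])  # slopes of the delta-h_m at one scale
m_l, m_u = deterministic_range(s_hat)
print(m_l, m_u, stochasticity(s_hat, m_l, m_u))  # → 2 3 0.0
```

In a deterministic scaling range the slopes in $(m_l, m_u)$ sum to the dimension contribution, so κ vanishes; in a stochastic range no slope survives the threshold and κ approaches 1.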

**Determining the constant in MT:** In order to obtain the scale-invariant constant c of the MT, see Equation (23), we would have to define a certain length scale ${\epsilon }_{D}$. Since this cannot be done robustly in practice (in particular because it may not be the same ${\epsilon }_{D}$ for each m), we resort to a different approach. The constant parts of the $\delta {h}_{{m}_{l}\dots {m}_{u}}$ terms in the MC can be determined from plateaus on larger scales. Thus, we define ${c}_{m}^{\mathrm{MT}}\left(\epsilon \right)={min}_{e\in (\epsilon ,{\epsilon }^{*}]}\delta {h}_{m}\left(e\right)$, where ${\epsilon }^{*}$ is the smallest scale $>\epsilon $ at which we have a near-zero slope, i.e., ${\epsilon }^{*}={arg\; min}_{e\in (\epsilon ,\infty )}s(m,e)<{s}_{\mathrm{min}}$. In case there is no such ${\epsilon }^{*}$ we set ${c}_{m}^{\mathrm{MT}}=0$.
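On a discretized scale axis this plateau search reduces to a running minimum up to the nearest near-flat scale. A minimal sketch under that assumption (illustrative names; `eps_grid` is ascending, `dh_m` and `s_m` are the curve and its slope for one m):

```python
import numpy as np

def plateau_constant(eps_grid, dh_m, s_m, s_min=0.1):
    """c_m^MT(eps_i) = min of dh_m over scales in (eps_i, eps*], where eps* is
    the smallest larger scale with slope < s_min; 0 if no such plateau exists."""
    n = len(eps_grid)
    c = np.zeros(n)
    for i in range(n):
        flat = [j for j in range(i + 1, n) if s_m[j] < s_min]
        if flat:
            j_star = flat[0]
            c[i] = dh_m[i + 1:j_star + 1].min()
    return c

# curve that decreases and then plateaus at 0.5 on the two largest scales
eps_grid = np.array([0.01, 0.02, 0.04, 0.08])
dh_m = np.array([1.2, 0.8, 0.5, 0.5])
s_m = np.array([0.5, 0.5, 0.05, 0.05])
print(plateau_constant(eps_grid, dh_m, s_m))  # → [0.5 0.5 0.5 0. ]
```

Taking the minimum (rather than the plateau value itself) keeps the constant conservative when the curve has not fully flattened at $\epsilon^{*}$.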

**Decomposition and quality measures:** The decomposition of the excess entropy follows Equations (24) and (25), with ${m}_{l}$ and ${m}_{u}$ used for splitting the terms:

**Table 1.** Quality measures for the decomposition algorithm. We use the Iverson bracket for Boolean expressions: $[\![\mathrm{True}]\!]:=1$ and $[\![\mathrm{False}]\!]:=0$. All measures are normalized to $[0,1]$, where typically 0 is the best score and 1 the worst.

| Quantity | Definition | Description |
|---|---|---|
| κ = stochastic(ε) | $1-{\sum }_{k={m}_{l}}^{{m}_{u}}\widehat{s}(k,\epsilon )$ | 0: fully deterministic, 1: fully stochastic at ε |
| % negative(ε) | $1/m{\sum }_{k=1}^{m}[\![\delta h(k,\epsilon )<0]\!]$ | percentage of negative $\delta h$ |
| % no fits(ε) | $1/m{\sum }_{k=1}^{m}[\![\text{no valid fit for }\delta h(k,\epsilon )]\!]$ | percentage of $\delta h$ for which no fit is available |
| % extrap.(ε) | $1/m{\sum }_{k=1}^{m}[\![\delta h(k,\epsilon )\text{ is extrapolated}]\!]$ | percentage of $\delta h$ that were extrapolated |

#### 2.5. Illustrative Example

**Deterministic system: Lorenz attractor** The Lorenz attractor is obtained as the solution to the following differential equations:

**Figure 1.** Lorenz attractor with standard parameters.

**(a)** Trajectory in state space $(x,y,z)$.

**(b)** Embedding of x with $m=3$, time-delay $\tau =10$, and $\Delta =0.01$.
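Generating such a trajectory and its delay embedding is straightforward. The following sketch assumes the usual "standard parameters" $\sigma=10$, $\rho=28$, $\beta=8/3$ (consistent with the figure caption, but our assumption) and an illustrative helper `delay_embed`:

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Standard Lorenz vector field (assumed parameters)."""
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

dt = 0.01  # sampling interval Delta
t = np.arange(0.0, 100.0, dt)
sol = solve_ivp(lorenz, (0.0, 100.0), [1.0, 1.0, 1.0], t_eval=t, rtol=1e-8)
x = sol.y[0]

def delay_embed(x, m, tau):
    """Delay embedding: rows (x_t, x_{t-tau}, ..., x_{t-(m-1)tau})."""
    return np.column_stack(
        [x[(m - 1 - i) * tau: len(x) - i * tau] for i in range(m)])

emb = delay_embed(x, m=3, tau=10)
print(emb.shape)  # (len(x) - 2*tau, 3)
```

The embedded point cloud `emb` is the input to the correlation-sum and KSG estimators discussed in Section 2.3.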

**Stochastic system: noisy Lorenz attractor** In order to illustrate the fundamental differences between deterministic and stochastic systems when analyzed with information theoretic quantities, we now consider the Lorenz dynamical system with dynamic noise (additive noise on the state ($x,y,z$) in Equations (38)–(40) before each integration step), as provided by the TISEAN package [41]. The dimension of a stochastic system is infinite, i.e., for embedding m the correlation integral yields the full embedding dimension, as shown in Figure 2b,c for small ε.
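The essential point of dynamic (as opposed to measurement) noise is that the perturbation is fed back into the dynamics. A minimal Euler-step sketch of this idea (TISEAN's actual integrator and noise handling may differ; parameters as assumed above):

```python
import numpy as np

def lorenz_step(s, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One explicit Euler step of the (assumed) standard Lorenz system."""
    x, y, z = s
    return s + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def noisy_lorenz(n, noise=0.005, dt=0.01, seed=4):
    """Dynamic noise: perturb the state *before* each integration step,
    so the noise influences the subsequent evolution."""
    rng = np.random.default_rng(seed)
    s = np.array([1.0, 1.0, 1.0])
    out = np.empty(n)
    for i in range(n):
        s = s + rng.normal(0.0, noise, size=3)  # additive dynamic noise
        s = lorenz_step(s, dt)
        out[i] = s[0]
    return out

x = noisy_lorenz(20000)
print(x.std())  # bounded, attractor-like amplitude
```

With measurement noise one would instead add the perturbation only to `out[i]`, leaving the state untouched; the two cases give markedly different small-scale behavior of the estimated dimension.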

**Figure 2.** Correlation dimension ${D}_{m}^{\left(2\right)}$ (Equation (34)) (**a**–**c**), conditional block entropies ${h}_{m}^{\left(2\right)}$ (Equation (35)) (**d**–**f**), excess entropy ${E}_{m}^{\left(2\right)}$ (Equation (36)) (**g**–**i**), and predictive information ${PI}_{m}$ (Equation (18)) (**j**–**l**) of the Lorenz attractor, estimated with the correlation sum method (**a**–**i**) and KSG (**j**–**l**), for the deterministic system (first column) and with absolute dynamic noise $0.005$ and $0.01$ (second and third column, respectively). The error bars in (**j**–**l**) show the values calculated on half of the data. All quantities are given in nats (natural unit of information, base e) and in dependence of the scale (ε, η in the space of x (Equation (38))) for a range of embeddings m, see color code. In (**d**–**f**) the fits for ${h}_{0}={H}_{1}$ allow to determine ${H}^{C}=2.65$, and those for ${h}_{m}$ give ${h}_{KS}$ and ${h}_{m}^{c}$, Equations (27)–(30). Parameters: delay embedding of x with $\tau =10$.

**Figure 3.** Excess entropy decomposition for the Lorenz attractor into state complexity (blue shading), memory complexity (red shading), and the ε-dependent part ${E}_{\epsilon }$ (beige shading) (all in nats). Columns as in Figure 2: deterministic system (a), with absolute dynamic noise $0.005$ (b), and $0.01$ (c). A set of quality measures and additional information is displayed at the top using the right axis. ${m}_{l}$ and ${m}_{u}$ refer to the terms in ${E}_{\epsilon }$ (Equation (37)). Scaling ranges identified as stochastic are shaded in gray (stochastic indicator $>{\kappa }_{\mathrm{max}}=0.5$). In manually chosen ranges, marked with vertical black lines, we evaluate the mean and standard deviation of the memory and state complexity. Parameters: ${s}_{\mathrm{min}}=0.1$.

**Table 2.** Decomposition of the excess entropy for the Lorenz system. Presented are the unrefined constant and dimension of the excess entropy, and the state-, memory-, and core complexity (in nats) determined in the deterministic scaling range, see Figure 3.

| | | Determ. | Noise $0.005$ | Noise $0.01$ | Noise $0.02$ |
|---|---|---|---|---|---|
| Equation (16) | D | 2.11 | 1.95 | 1.68 | 1.59 |
| Equation (16) | $\mathrm{const}$ | 4.62 | 4.55 | 4.37 | 4.21 |
| Equation (27) | ${\epsilon }^{*}$ | 0 | 0.12 | 0.23 | 0.45 |
| Equation (24) | ${E}_{\mathrm{state}}$ | 0.68 ± 0 | 0.68 ± 0 | 0 | 0 |
| Equation (25) | ${E}_{\mathrm{mem}}$ | 1.86 ± 0.06 | 1.27 ± 0.02 | 1.98 ± 0.11 | 1.2 ± 0.1 |
| Equation (26) | ${E}_{\mathrm{core}}$ | 2.54 ± 0.06 | 1.95 ± 0.02 | 1.98 ± 0.11 | 1.2 ± 0.1 |

## 3. Results

#### 3.1. Application to Robotics: Controller and Learning

#### 3.2. Experiments

**Table 3.** Experiments, data sets and videos, see [18].

| Robot | Behavior | Length | Dataset | Video |
|---|---|---|---|---|
| Snake | side rolling | $9\times {10}^{5}$ | D-S1 | Video S1 |
| Snake | crawling | $9\times {10}^{5}$ | D-S2 | Video S2 |
| Hexapod | jumping after 2 min | ${10}^{6}$ | D-H1 | Video H1 |
| Hexapod | jumping after 4 min | ${10}^{6}$ | D-H2 | Video H2 |
| Hexapod | jumping after 8 min | ${10}^{6}$ | D-H3 | Video H3 |

**Figure 4.** Robots used for the experiments.

**(a)** The Hexapod. 18 actuated DoF: 2 shoulder and 1 knee joint per leg.

**(b)** The Snake. 14 DoF: 8 segments connected by 2-DoF joints each.

**Figure 5.** Side-rolling and crawling of the Snake. Note that also for rolling each actuator has to act accordingly, because no torsion is possible, see Figure 4b.

#### 3.3. Quantifying the Behavior

#### 3.3.1. Snake

**Figure 7.** Quantification of two Snake behaviors. First column: side rolling; second column: crawling (see Figure 5(**a**,**b**)). Phase plots with Poincaré sections at the maximum of first and second embedding. (**e**,**f**) Excess entropy with decomposition (in nats), see Figure 3 for more details. (**g**,**h**) Predictive information (in nats), see Figure 2 for more details. Parameters: delay embedding of sensor ${x}_{7}$ (middle segment) with $\tau =8$, ${s}_{\mathrm{min}}=0.05$, ${\kappa }_{\mathrm{max}}=0.5$.

**Table 4.** Decomposition of the excess entropy for the Snake. Dimension and unrefined constant, state-, memory-, and core complexity (in nats) on the coarse deterministic scaling range $\epsilon \in (0.05,0.6)$.

| | | Side Rolling | Crawling |
|---|---|---|---|
| Equation (16) | D | 1.07 | 1.15 |
| Equation (16) | $\mathrm{const}$ | 0.7 | 0.53 |
| Equation (24) | ${E}_{\mathrm{state}}$ | 0.0 ± 0 | 0 ± 0 |
| Equation (25) | ${E}_{\mathrm{mem}}$ | 0.69 ± 0.14 | 0.94 ± 0.16 |
| Equation (26) | ${E}_{\mathrm{core}}$ | 0.69 ± 0.14 | 0.94 ± 0.16 |

**Figure 8.** Quantification of the three Hexapod behaviors. First column: 2 min; second column: 4 min; third column: 8 min, see Figure 6. (**a**–**c**) Phase plots with Poincaré sections at the maximum and minimum of first and second embedding. The color of the trajectory (blue–black) encodes the third embedding dimension. (**g**–**i**) Excess entropy with decomposition (in nats), see Figure 3 for more details. The excess entropy fit at the smallest scale in (**g**) is $-18-5\log\left(\epsilon \right)$. (**j**–**l**) Predictive information (in nats), see Figure 2 for more details. Parameters: delay embedding of sensor ${x}_{1}$ (up-down direction of the right hind leg) with $\tau =4$, ${s}_{\mathrm{min}}=0.05$, ${\kappa }_{\mathrm{max}}=0.5$.

#### 3.3.2. Hexapod

**Table 5.** Decomposition of the excess entropy for the Hexapod behaviors on the fine and coarse scale. For the 4 min and 8 min behaviors we have no reliable estimate on the fine scale. The core complexity and the dimension rise from 2 min to 4 min. From 4 min to 8 min the dimension stays roughly the same, but the core complexity increases. The last line provides the predictive information on the smallest scale.

| | | 2 min (Fine) | 2 min (Coarse) | 4 min (Coarse) | 8 min (Coarse) |
|---|---|---|---|---|---|
| | ε-Range | $(0.006,0.025)$ | $(0.1,0.4)$ | $(0.05,0.2)$ | $(0.1,0.4)$ |
| Equation (16) | D | 1.6 | 1.25 | 3.36 | 3.45 |
| Equation (16) | $\mathrm{const}$ | −0.09 | −0.2 | −0.81 | 0.03 |
| Equation (24) | ${E}_{\mathrm{state}}$ | 0.7 ± 0.11 | 0 ± 0 | 0.17 ± 0.03 | 0 ± 0 |
| Equation (25) | ${E}_{\mathrm{mem}}$ | 2.1 ± 0.27 | 0.85 ± 0.16 | 3.74 ± 0.54 | 4.79 ± 1.03 |
| Equation (26) | ${E}_{\mathrm{core}}$ | 2.8 ± 0.38 | 0.85 ± 0.16 | 3.91 ± 0.57 | 4.79 ± 1.03 |
| Equation (18) | $PI(\eta =0)$ | $\approx 9$ | – | $\approx 5.5$ | $\approx 4$ |

## 4. Discussion

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Appendix

## Appendix A. Comparison and Validation of Estimators Using Autoregressive Model of the Lorenz Attractor

**Figure A1.** Analysis of an AR${}_{2}$-model (${a}_{1}=1.991843$, ${a}_{2}=-0.994793$) of the Lorenz attractor observed at delay 1 and delay 10. See Figure 2 for details. (**a**,**b**) The dimensionality corresponds to the embedding dimension (as long as the data suffices). (**c**,**d**) Conditional entropies show the entropy of the noise for $m>2$. The differences between first and second, and second and third embedding reveal the actual structure of the time series, better visible in (**e**) and (**f**). (**e**,**f**) The excess entropy converges to the theoretical values of $2.91$ for $m=1$ and $7.48$ for $m\ge 2$ (delay 1), and $0.67$ for $m=1$ and $3.36$ for $m\ge 2$ (delay 10). (**g**,**h**) Predictive information estimated with the KSG algorithm with $k=20$.

## Appendix B. Details of the Decomposition Algorithm

#### Fitting

**Figure A2.** Illustration of the fitting algorithm and the determination of ${c}_{m}^{\mathrm{MT}}$ and ${q}_{\mathrm{max}}$. (**a**) Shown are the $\delta {h}_{m}$ curves for the Lorenz system with 0.005 dynamic noise, see Figure 2. The fits are shown by dotted gray lines and their validity range r is marked by triangles. The dashed colored lines show ${c}_{m}^{\mathrm{MT}}$ for $m=1,2,3$. (**b**) Ranked fit qualities for all segments with 10 data points of $\delta {h}_{1}$ (solid line) with the $25\%$ percentile (dotted line) and threshold ${q}_{\mathrm{max}}$ (dashed red line). For visibility the y-axis is cut ($\max q=0.013$).

## References

1. Ay, N.; Bertschinger, N.; Der, R.; Güttler, F.; Olbrich, E. Predictive information and explorative behavior of autonomous robots. Eur. Phys. J. B 2008, 63, 329–339.
2. Zahedi, K.; Ay, N.; Der, R. Higher coordination with less control—A result of information maximization in the sensorimotor loop. Adapt. Behav. 2010, 18, 338–355.
3. Klyubin, A.S.; Polani, D.; Nehaniv, C.L. Empowerment: A universal agent-centric measure of control. In Proceedings of the IEEE Congress on Evolutionary Computation, Edinburgh, UK, 5 September 2005; Volume 1, pp. 128–135.
4. Salge, C.; Glackin, C.; Polani, D. Empowerment—An Introduction. In Guided Self-Organization: Inception; Emergence, Complexity and Computation; Prokopenko, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2014; Volume 9, pp. 67–114.
5. Schmidhuber, J. Exploring the Predictable. In Advances in Evolutionary Computing; Ghosh, A., Tsuitsui, S., Eds.; Springer: Berlin, Germany, 2002; pp. 579–612.
6. Oudeyer, P.Y.; Kaplan, F.; Hafner, V.V.; Whyte, A. The Playground Experiment: Task-Independent Development of a Curious Robot. In Proceedings of the AAAI Spring Symposium on Developmental Robotics; Bank, D., Meeden, L., Eds.; Stanford, CA, USA, 2005; pp. 42–47.
7. Frank, M.; Leitner, J.; Stollenga, M.; Förster, A.; Schmidhuber, J. Curiosity Driven Reinforcement Learning for Motion Planning on Humanoids. Front. Neurorobotics 2014, 7.
8. Der, R.; Martius, G. The Playful Machine—Theoretical Foundation and Practical Realization of Self-Organizing Robots; Springer: Berlin/Heidelberg, Germany, 2012.
9. Martius, G.; Der, R.; Ay, N. Information Driven Self-Organization of Complex Robotic Behaviors. PLoS ONE 2013, 8.
10. Sayama, H. Guiding designs of self-organizing swarms: Interactive and automated approaches. In Guided Self-Organization: Inception; Emergence, Complexity and Computation; Prokopenko, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2014; Volume 9, pp. 365–387.
11. Der, R.; Martius, G. Behavior as broken symmetry in embodied self-organizing robots. In Advances in Artificial Life, ECAL 2013; MIT Press: Cambridge, MA, USA, 2013; pp. 601–608.
12. Der, R.; Martius, G. A Novel Plasticity Rule Can Explain the Development of Sensorimotor Intelligence. Proc. Natl. Acad. Sci. USA 2015, in press.
13. Bialek, W.; Nemenman, I.; Tishby, N. Predictability, Complexity and Learning. Neural Comput. 2001, 13, 2409–2643.
14. Shaw, R. The Dripping Faucet as a Model Chaotic System; Aerial Press: Santa Cruz, CA, USA, 1984.
15. Crutchfield, J.P.; Feldman, D.P. Regularities unseen, randomness observed: Levels of entropy convergence. Chaos 2003, 13, 25–54.
16. Grassberger, P. Toward a quantitative theory of self-generated complexity. Int. J. Theor. Phys. 1986, 25, 907–938.
17. Van der Weele, J.P.; Banning, E.J. Mode interaction in horses, tea, and other nonlinear oscillators: The universal role of symmetry. Am. J. Phys. 2001, 69, 953–965.
18. Martius, G.; Olbrich, E. Supplementary Material. Available online: http://playfulmachines.com/QuantBeh2015/ (accessed on 20 October 2015).
19. Martius, G. Source Code at the Repository. Available online: https://github.com/georgmartius/behavior-quant (accessed on 20 October 2015).
20. Grassberger, P. Randomness, Information, and Complexity. In Proceedings of the 5th Mexican School on Statistical Physics (EMFE 5); Ramos-Gómez, F., Ed.; World Scientific: Singapore, 1991; corrected version arXiv:1208.3459.
21. Prokopenko, M.; Boschetti, F.; Ryan, A.J. An information-theoretic primer on complexity, self-organization, and emergence. Complexity 2009, 15, 11–28.
22. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Statist. 1956, 27, 832–837.
23. Shore, J.E.; Johnson, R.W. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Inf. Theory 1980, IT-26, 26–37.
24. Giffin, A. From physics to economics: An econometric example using maximum relative entropy. Physica A 2009, 388, 1610–1620.
25. Takens, F.; Verbitski, E. Generalized entropies: Rényi and correlation integral approach. Nonlinearity 1998, 11.
26. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E 2004, 69.
27. Gaspard, P.; Wang, X.J. Noise, chaos and (ϵ,τ)-entropy per unit time. Phys. Rep. 1993, 235, 291–343.
28. Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley: New York, NY, USA, 2006.
29. Ding, M.; Grebogi, C.; Ott, E.; Sauer, T.; Yorke, J.A. Plateau onset of correlation dimension: When does it occur? Phys. Rev. Lett. 1993, 70, 3872–3875.
30. Cohen, A.; Procaccia, I. Computing the Kolmogorov entropy from time signals of dissipative and conservative dynamical systems. Phys. Rev. A 1985, 31, 1872–1882.
31. Sinai, Y. Kolmogorov-Sinai entropy. Scholarpedia 2009, 4.
32. Sauer, T.; Yorke, J.A.; Casdagli, M. Embedology. J. Stat. Phys. 1991, 65, 579–616.
33. Vasicek, O. A test for normality based on sample entropy. J. Royal Stat. Soc. Ser. B 1976, 38, 54–59.
34. Dobrushin, R. A simplified method of experimentally evaluating the entropy of a stationary sequence. Theory Probab. Appl. 1958, 3, 428–430.
35. Kozachenko, L.; Leonenko, N.N. Sample estimate of the entropy of a random vector. Problemy Peredachi Informatsii 1987, 23, 9–16.
36. Kantz, H.; Schreiber, T. Nonlinear Time Series Analysis; Cambridge University Press: Cambridge, UK, 2004; Chapter 6; Volume 7.
37. Grassberger, P. Grassberger-Procaccia algorithm. Scholarpedia 2007, 2.
38. Grassberger, P.; Procaccia, I. Characterization of strange attractors. Phys. Rev. Lett. 1983, 50.
39. Fraser, A.M.; Swinney, H.L. Independent coordinates for strange attractors from mutual information. Phys. Rev. A 1986, 33, 1134–1140.
40. Olbrich, E.; Kantz, H. Inferring chaotic dynamics from time-series: On which length scale determinism becomes visible. Phys. Lett. A 1997, 232, 63–69.
41. Hegger, R.; Kantz, H.; Schreiber, T. Practical implementation of nonlinear time series methods: The TISEAN package. Chaos 1999, 9, 413–435.
42. Der, R. On the role of embodiment for self-organizing robots: Behavior as broken symmetry. In Guided Self-Organization: Inception; Emergence, Complexity and Computation; Prokopenko, M., Ed.; Springer: Berlin, Germany, 2014; Volume 9, pp. 193–221.
43. Martius, G.; Hesse, F.; Güttler, F.; Der, R. LpzRobots: A Free and Powerful Robot Simulator. Available online: http://robot.informatik.uni-leipzig.de/software (accessed on 1 July 2010).
44. Der, R.; Martius, G.; Hesse, F. Let It Roll—Emerging Sensorimotor Coordination in a Spherical Robot. In Artificial Life X; Rocha, L.M., Yaeger, L.S., Bedau, M.A., Floreano, D., Goldstone, R.L., Vespignani, A., Eds.; MIT Press: Cambridge, MA, USA, 2006; pp. 192–198.
45. Schmidhuber, J. Curious model-building control systems. In Proceedings of the 1991 IEEE International Joint Conference on Neural Networks, Singapore, 18–21 November 1991; Volume 2, pp. 1458–1463.
46. Oudeyer, P.Y.; Kaplan, F.; Hafner, V. Intrinsic motivation systems for autonomous mental development. IEEE Trans. Evolut. Comput. 2007, 11, 265–286.
47. Doncieux, S.; Mouret, J.B. Beyond black-box optimization: A review of selective pressures for evolutionary robotics. Evolut. Intell. 2014, 7, 71–93.
48. Lungarella, M.; Sporns, O. Mapping information flow in sensorimotor networks. PLoS Comput. Biol. 2006, 2.
49. Schmidt, N.M.; Hoffmann, M.; Nakajima, K.; Pfeifer, R. Bootstrapping perception using information theory: Case studies in a quadruped robot running on different grounds. Adv. Complex Syst. 2013, 16.
50. Wang, X.R.; Miller, J.M.; Lizier, J.T.; Prokopenko, M.; Rossi, L.F. Quantifying and tracing information cascades in swarms. PLoS ONE 2012, 7.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
