2. A Statistical Test of Market Efficiency Based on Information Theory
To work with the equations of information theory, we build a binary sequence from a financial time series of n+1 prices $P_0, \dots, P_n$. We do the following: for $i = 1, \dots, n$, we set $X_i = 1$ if the price rises between times $i-1$ and $i$, and $X_i = 0$ otherwise. We assume that the random series $(X_i)$ is stationary. As an example of a symbolic representation of a sequence of consecutive prices, we have $(0, 1, 1)$, which represents a decrease followed by two daily price increases. We work with a number of lags $L$, so $2^L$ sequences are possible, ordered with Gray’s binary code. A Gray code is an ordering of the binary numeral system in which two successive values differ in only one binary digit. We use this code to order the sequences, but any other order is possible and does not change the following computations. For example, if L = 3, then the eight possible sequences are $(0,0,0)$, $(0,0,1)$, $(0,1,1)$, $(0,1,0)$, $(1,1,0)$, $(1,1,1)$, $(1,0,1)$, $(1,0,0)$. Then, we introduce the probability $p_i^L$ of drawing the $i$-th sequence of length $L$, with $i \in \{1, \dots, 2^L\}$. These sequences and their probabilities form a probabilistic choice system, whose amount of uncertainty is measured by the Shannon entropy [4], defined by:

$$H^L = -\sum_{i=1}^{2^L} p_i^L \log_2 p_i^L. \quad (1)$$
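The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, a price that stays flat is coded as 0 by assumption, the sequences are counted with overlapping windows, and the entropy is measured in bits.

```python
from collections import Counter
from math import log2


def binarize(prices):
    """Map n+1 prices to n bits: 1 for a rise, 0 otherwise (ties coded as 0)."""
    return [1 if b > a else 0 for a, b in zip(prices, prices[1:])]


def gray_code(L):
    """Return the 2**L binary words of length L in reflected Gray order."""
    return [tuple(((k ^ (k >> 1)) >> j) & 1 for j in reversed(range(L)))
            for k in range(2 ** L)]


def shannon_entropy(bits, L):
    """Empirical Shannon entropy (in bits) of the overlapping length-L words."""
    words = [tuple(bits[i:i + L]) for i in range(len(bits) - L + 1)]
    counts = Counter(words)
    total = len(words)
    return -sum(c / total * log2(c / total) for c in counts.values())
```

For instance, `binarize([100, 99, 101, 102])` gives `[0, 1, 1]`, a decrease followed by two increases, and `gray_code(3)` lists the eight sequences of length 3 so that consecutive words differ in a single digit.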
We can now introduce the market entropy by decomposing a sequence of length $L+1$ into:
- an observed prefix of length $L$, equal to the $i$-th sequence with probability $p_i^L$;
- a random suffix whose conditional distribution is Bernoulli with parameter $\pi_i^L$.
The sequence is thus equal to the prefix followed by 1 with probability $p_i^L \pi_i^L$, and to the prefix followed by 0 with probability $p_i^L (1 - \pi_i^L)$. The market entropy is then:

$$H^{L+1} = -\sum_{i=1}^{2^L} \left[ p_i^L \pi_i^L \log_2\left(p_i^L \pi_i^L\right) + p_i^L (1 - \pi_i^L) \log_2\left(p_i^L (1 - \pi_i^L)\right) \right]. \quad (2)$$

The EMH asserts that, conditionally on the prefix, the suffix values 0 and 1 are equiprobable. This leads to the particular case $\pi_i^L = 1/2$ for every $i$, and the entropy of an efficient market is then:

$$H_\star^{L+1} = -\sum_{i=1}^{2^L} p_i^L \log_2\left(\frac{p_i^L}{2}\right). \quad (3)$$

With (2) and (3), we can define the market information as the difference between the entropy of an efficient market and the entropy of the market:

$$I^{L+1} = H_\star^{L+1} - H^{L+1}. \quad (4)$$

The theoretical value of the market information is 0 if and only if the market is efficient, and $I^{L+1} > 0$ as soon as the EMH does not hold. We start by evaluating the empirical probabilities $\hat{p}_i^L$ and $\hat{\pi}_i^L$ in order to obtain the empirical market information $\hat{I}^{L+1}$. Even if the EMH holds, $\hat{I}^{L+1}$ may be slightly higher than 0, so a statistical test is needed to decide whether the market is efficient.
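The estimation step can be sketched as follows. The sketch relies on a simplification that follows from the definitions: the difference between the efficient-market entropy and the market entropy equals the sum, over prefixes, of the prefix probability times one minus the binary entropy of the suffix given that prefix. The function name and the use of overlapping windows are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def market_information(bits, L):
    """Empirical market information (in bits) for prefixes of length L."""
    n = len(bits) - L                      # number of (prefix, suffix) pairs
    prefix = Counter()                     # occurrences of each prefix
    ups = Counter()                        # occurrences followed by a 1
    for t in range(n):
        w = tuple(bits[t:t + L])
        prefix[w] += 1
        ups[w] += bits[t + L]
    info = 0.0
    for w, c in prefix.items():
        p = c / n                          # empirical prefix probability
        pi = ups[w] / c                    # empirical Bernoulli parameter
        # binary entropy of the suffix, with the convention 0 * log2(0) = 0
        h = -sum(q * log2(q) for q in (pi, 1 - pi) if q > 0)
        info += p * (1.0 - h)
    return info
```

A perfectly alternating series such as `[0, 1] * 50` with `L = 1` gives an information of 1 bit, since the suffix is fully determined by the prefix, whereas a series whose suffix is close to equiprobable gives a value close to 0.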
The null hypothesis is that the market is efficient, that is, that the market information is equal to 0.
Because we only have access to an estimator of the market information, we need its distribution under the null hypothesis.
In this work, we compute the moment-generating function of this estimator under the null hypothesis.
Theorem 1. For , the moment-generating function of , conditionally on the event (using the conventions and ), is:where: .
We are then able to express all the moments of the estimator under the null hypothesis.
Theorem 2. The conditional moment of order r of for iswhere: .
Theorems 1 and 2 are proven in [5]. We also define an asymptotic market information, obtained by scaling the empirical market information by $n$. We compute the asymptotic conditional moment-generating function and demonstrate in [5] that, when n tends to infinity, it converges to:

$$\left(1 - \frac{t}{\ln 2}\right)^{-2^{L-1}}, \quad t < \ln 2.$$

We recognize the gamma distribution of shape parameter $2^{L-1}$ and scale parameter $1/\ln 2$. The limit no longer depends on the conditioning, so it is also a non-conditional moment-generating function.
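As an illustration of how such an asymptotic law can be turned into a test, the sketch below computes a p-value under two explicit assumptions that are not taken from the text: the statistic is n times the empirical market information measured in bits, and its limit law is a gamma distribution with shape $2^{L-1}$ and scale $1/\ln 2$ (equivalently, $2 n \ln 2$ times the empirical information follows a chi-squared law with $2^L$ degrees of freedom, as in the classical G-test). Because the shape is an integer for $L \geq 1$, the survival function has a closed form and no external library is needed. Function names are hypothetical.

```python
from math import exp, log


def gamma_sf_integer_shape(x, shape, scale):
    """Survival function P(G > x) of a gamma law with integer shape (Erlang)."""
    z = x / scale
    term, total = 1.0, 1.0
    for j in range(1, shape):
        term *= z / j                      # term is now z**j / j!
        total += term
    return exp(-z) * total


def efficiency_p_value(info_hat, n, L):
    """Asymptotic p-value of the null 'the market is efficient' (assumed scaling)."""
    shape = 2 ** (L - 1)                   # assumption: L >= 1, integer shape
    scale = 1.0 / log(2.0)                 # assumption: information in bits
    return gamma_sf_integer_shape(n * info_hat, shape, scale)
```

A small p-value would lead to rejecting efficiency at the chosen level. For short samples, the exact conditional moments of Theorems 1 and 2 would be the relevant reference rather than this limit.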
By applying a Kolmogorov–Smirnov test, we observe that the asymptotic distribution is validated for
with
[
5].
3. Maxwell’s Demon in Financial Markets
The history of financial markets shows us that the EMH often does not hold [6]. Andrew Lo introduced the adaptive markets hypothesis (AMH) [6], which states that statistical arbitrage is possible and that the EMH depends on the time scale. With the statistical test described above, and working with high-frequency EURUSD data from 2 January 2007 (1:31 AM) to 31 December 2007 (10:59 PM), we observe that the market is inefficient at a one-minute frequency but efficient at a daily frequency.
The statistical arbitrage of an investor is reminiscent of the action of a Maxwell’s demon on a thermodynamic system. Statistical arbitrage refers to trading strategies that use statistical techniques to make a profit while reducing market risk. Even if the link between physical entropy and information is not easy to determine, we study how an informed investor can act on financial markets with the knowledge of alternative data. In 1978, Ed Thorp’s question was not “Is the market efficient?” but rather “How inefficient is the market?”. Our statistical test and the Maxwell’s demon experiment are thus in line with Ed Thorp’s idea.
In this work, we study the evolution of the market entropy and the market efficiency with an informed investor acting as a Maxwell’s demon by performing statistical arbitrage.
We then work with two binary time series. The first corresponds to the successive increments in the market price. The second corresponds to a series of alternative data that only the Maxwell’s demon possesses and that has predictive power on the first. Alternative data are becoming popular among hedge fund managers and financial institutions, and comprise data from all sources, including non-financial ones (web traffic, satellites, mobile devices, etc.).
We model the action of the demon in the market and simulate the prices with the following model:
With: : the weight of all the participants in the market without knowledge of alternative data;
: The weight of the Maxwell’s demon;
;
r: the risk-free rate;
: the volatility;
: the Bernoulli variable corresponding to an efficient market implied by noise traders;
;
corresponding to the action of the demon on the market (0: it is shorting the security, 1: it is buying the security)—it depends on
We have the constraint: . With and , the actualized price is a martingale. Thus, .
If we have and , we recognize the Black–Scholes model under the risk-neutral measure Q: with
For the simulations, we work with two binary sequences and , respectively, the sequence of prices on the market and the sequence of alternative data.
We first create the sequence corresponding to an efficient market. Then, we create the sequence correlated with the sequence of alternative data. The correlation between the market at time t and the demon’s alternative data at time t − 1 is central to the action of the Maxwell’s demon: if the correlation is 0, then the alternative data do not allow any prediction.
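One simple way to produce two such binary sequences is sketched below. It is only an assumption about how the simulation could be set up, not the authors' procedure: the alternative data are fair coin flips, and the market at time t copies the alternative data at time t − 1 with a hypothetical probability `q`. The market sequence then stays unpredictable from its own past, while its correlation with the lagged alternative data is 2q − 1.

```python
import random


def simulate_sequences(n, q, seed=0):
    """Return (market, alt) binary lists of length n.

    alt[t] is a fair coin flip; market[t] copies alt[t - 1] with
    probability q and takes the opposite value otherwise.
    """
    rng = random.Random(seed)
    alt = [rng.randint(0, 1) for _ in range(n)]
    market = [rng.randint(0, 1)]           # no lagged value exists at t = 0
    for t in range(1, n):
        same = rng.random() < q
        market.append(alt[t - 1] if same else 1 - alt[t - 1])
    return market, alt


def lagged_correlation(market, alt):
    """Pearson correlation between market[t] and alt[t - 1]."""
    x, y = market[1:], alt[:-1]
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / m
    vx = sum((a - mx) ** 2 for a in x) / m
    vy = sum((b - my) ** 2 for b in y) / m
    return cov / (vx * vy) ** 0.5
```

With `q = 0.5` the lagged correlation is close to 0 and the alternative data carry no predictive power, while `q = 0.9` gives a correlation near 0.8.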
In this section, we work with . The probability available for the informed investor is: .
The demon knows the probability of a rise in the security at time t given the values of the alternative data and of the market at time t − 1, whereas a non-informed investor only knows the probability of a rise given the value of the market at time t − 1.
If we have , then the demon is buying the security at the moment t if and (so ).
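The demon's decision rule can be illustrated with the following sketch, which estimates the probability of a rise conditionally on the previous market move and the previous alternative datum, then buys when this probability exceeds a threshold. The one-step conditioning, the threshold of 1/2, and the function names are assumptions made for the example. The coding of the action (1 for buying, 0 for shorting) follows the convention stated with the model.

```python
from collections import Counter


def demon_probabilities(market, alt):
    """Estimate P(market[t] = 1 | market[t-1], alt[t-1]) for each observed state."""
    seen, ups = Counter(), Counter()
    for t in range(1, len(market)):
        state = (market[t - 1], alt[t - 1])
        seen[state] += 1
        ups[state] += market[t]
    return {s: ups[s] / seen[s] for s in seen}


def demon_action(prob_up, threshold=0.5):
    """Return 1 (buy) if a rise is more likely than the threshold, else 0 (short)."""
    return 1 if prob_up > threshold else 0
```

A non-informed investor would use the same estimator with the state reduced to `market[t - 1]` alone.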
Brillouin’s Perspective
Brillouin proposed a thought experiment about Maxwell’s demon to understand whether it can operate on the thermodynamic system described by Maxwell [7]. If the thermodynamic system is at a constant temperature at the beginning of the experiment, then the radiation is that of a “blackbody” [8], so the demon cannot see the molecules, because a blackbody absorbs all incident electromagnetic radiation. Unable to determine the velocity of the molecules, the demon cannot violate the second principle. Brillouin therefore introduces a source of light so that the demon can see the molecules. However, this action increases the overall entropy. Brillouin then considers the following cycle: Negentropy → Information → Negentropy, and suggests that an increase in entropy is related to a loss of information. The Negentropy phase corresponds to the time when the demon uses light to see the molecules. It then has the information needed to move the molecules but, after this action, it has to use the light again to restart the cycle with a new phase of Negentropy.
One can then draw an analogy with the market information (4): if the entropy of the market increases, then the market information decreases. An informed investor thus acts as a Maxwell’s demon on financial markets. The light of the informed investor is the alternative data they use. However, by using this information, they reduce the overall information, and they then have to use the alternative data again to discover new opportunities for statistical arbitrage.
For the simulations, we compute several quantities to determine the demon’s behaviors. With
, we can introduce
with
. With
. We determine the probabilistic choice system
, and the Shannon entropy is then defined by:
We are then able to introduce the demon entropy by using the same decomposition used for the market entropy. We then have:
- An observed prefix of probability with ;
- A random suffix whose conditional distribution is a Bernoulli of parameter .
The sequence is thus equal to with probability and with probability . The demon entropy is then:
We can then introduce corresponding to a particular case with .
We can finally define the demon information with:
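The demon information can be estimated with the same computation as the market information, the only change being that the prefix also contains the alternative data. The sketch below assumes one-step prefixes and information measured in bits; the function name and its variadic signature are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_information(suffix, *prefix_series):
    """Sum over prefixes of p * (1 - h(pi)), in bits, for one-step prefixes."""
    n = len(suffix) - 1                    # number of (prefix, suffix) pairs
    seen, ups = Counter(), Counter()
    for t in range(1, n + 1):
        state = tuple(series[t - 1] for series in prefix_series)
        seen[state] += 1
        ups[state] += suffix[t]
    info = 0.0
    for state, c in seen.items():
        pi = ups[state] / c
        # binary entropy of the suffix, with the convention 0 * log2(0) = 0
        h = -sum(q * log2(q) for q in (pi, 1 - pi) if q > 0)
        info += (c / n) * (1.0 - h)
    return info
```

With two aligned binary lists `market` and `alt`, `conditional_information(market, market)` gives the information available to a non-informed investor and `conditional_information(market, market, alt)` the information available to the demon. Because the demon conditions on a finer set of states, the second value is never smaller than the first.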
We also introduce the correlation between the market at time t and the alternative data at time t − 1, which measures how well the alternative data predict the market. This correlation is important in explaining the action of the demon. If the time series of alternative data does not explain the time series of the financial market well, then the alternative data are not useful for a Maxwell’s demon.
4. Simulations of the Maxwell’s Demon on Financial Markets
We are now able to simulate the price of an asset on financial markets with the action of Maxwell’s demon. We take
and
, which is reasonable considering Kyle’s model and the informed investor’s schizophrenia [9]. This schizophrenia corresponds to the fear that informed investors have of revealing their information. In Kyle’s model, informed investors use only half of the information they have, to avoid giving it away to the other participants. We compute the market information with
and
. We performed the simulation for 1000 business days. For the first 200 days, the Maxwell’s demon does not invest in the financial security
. The vertical lines in Figure 1a,b mark the separation between the periods when
and
. We have
. We take
and
.
We can see from Figure 1 that the demon’s information is strongly correlated with the correlation between the alternative data and the market: the correlation between the time series in Figure 1b,d is 0.92. This shows the importance of the correlation between the alternative data and the market price: if the correlation decreases, then the informed investor is less able to predict the future return, so its information decreases. As in Brillouin’s perspective, if the intensity of light is not sufficient to determine the position and speed of the molecules, then the Maxwell’s demon is not able to move the molecules in order to change the entropy of the thermodynamic system.
One could also expect that, after the demon’s first transaction, the market information would decrease as the market becomes more efficient. Even if this pattern seems to hold in Figure 1a, it is not always the case. However, the mean of the market information is lower when the demon invests than when it is passive. As in physics, where the second law of thermodynamics is statistical, a decrease in the market information does not happen at every step. In the long term, however, we always observe a decrease in the market information (and thus an increase in market entropy), just as the entropy of a thermodynamic system increases.
Another interesting analogy is that, in physics, the entropy of two systems taken together is higher than the entropy of each of them. This means that we cannot create order by adding disorder to disorder. Figure 2 shows this property by comparing the entropy of the system that includes both the Maxwell’s demon and the entire financial market to, respectively, the entropy of the financial market (Figure 2a) and the entropy of the time series of alternative data (Figure 2b).
We can observe in Figure 2 that the entropy of the system {Maxwell’s demon+financial market} is higher than the entropy of the system {financial market} and higher than the entropy of the system {Maxwell’s demon}. Hence, an investor using alternative data is always able to send entropy to the financial markets and make the market more efficient by reducing the market information.
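The ordering between the joint entropy and the entropy of each component can be checked numerically with a short sketch. It works on single symbols rather than on sequences of several lags, which is a simplifying assumption, and the function names are hypothetical. For empirical distributions, the joint entropy is always at least as large as each marginal entropy.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Empirical Shannon entropy (in bits) of a sequence of hashable symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * log2(c / n) for c in counts.values())


def joint_and_marginal_entropies(market, alt):
    """Return (H(market, alt), H(market), H(alt)) for two aligned sequences."""
    return entropy(list(zip(market, alt))), entropy(market), entropy(alt)
```

For `market = [0, 1, 0, 1]` and `alt = [0, 0, 1, 1]`, the joint entropy is 2 bits while each marginal entropy is 1 bit.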
If the correlation between the alternative data and the market is low, then the demon’s information stays in the interval that corresponds to an absence of information. The demon is then unable to adequately predict the future behavior of the market, because the probability of a rise that it estimates is close to 1/2.
Figure 3 represents the simulated evolution of the demon’s information (Figure 3a) together with the evolution of the correlation between the alternative data and the market (Figure 3b), for a correlation lower than in Figure 1. We recall that the alternative data have predictive power on the market, so a lower correlation between the two series means that the time series of alternative data cannot give enough information to the investor using the data.
In Figure 3, we observe that the correlation is nearer to 0 than in Figure 1 and that the demon’s information does not allow it to invest in financial markets with a strong probability of gain (the probability of a rise that it estimates stays close to 1/2). We can also observe that the demon’s information is less correlated with the correlation between the alternative data and the market: the correlation between the time series in Figure 3a,b is 0.07. Thus, a lower correlation between the alternative data and the market implies lower information for the demon, and this information is less well explained by the correlation. The alternative data it uses are therefore less informative.
We also observe in Figure 1 that the market information does not stay above the confidence bound for a long period of time. If the market is inefficient (meaning that arbitrage is possible), then the participants and the Maxwell’s demon take advantage of this situation and the market quickly becomes efficient again. This is reminiscent of Brillouin’s cycle: after a peak in information, we lose it by intervening in the system.