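Let us start our analysis with martingale tests based on regression analysis. Given a process $X_t$, the martingale property requires that
$$E[X_t \mid \mathcal{F}_s] = X_s$$
for $s \le t$, and more specifically, when $\mathcal{F}_s$ is the sigma algebra generated by the process $X$:
$$E[X_t \mid X_0, X_1, \dots, X_s] = X_s.$$
The linear regression analysis gives the conditional expectation in terms of a linear combination of the $X_i$'s:
$$E[X_t \mid X_0, X_1, \dots, X_s] = \sum_{i=0}^{s} \beta_i X_i.$$
Should the result of the linear regression correspond to the martingale, the coefficients $\beta_i$ should be statistically zero for $i < s$ and $\beta_s$ should be statistically one. This leads to the following hypothesis test:
$$H_0: \beta_s = 1 \text{ and } \beta_i = 0 \text{ for } i < s.$$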
Variants of this test have appeared in the previous literature, see for instance Croxson and Reade (2014). In this text, we limit ourselves to performing this test for only two observations, at times $s$ and $t$ with $s < t$. Our primary focus is not on finding the exact time pairs $(s, t)$ for which the martingale hypothesis was violated, as a violation could happen randomly, and no profitable trading strategy need ever have existed even when we find a violation of this hypothesis in retrospect. Instead, we use this approach on the training data set as a possible indication of market inefficiency, obtain an alternative prediction of the outcome based on the regression analysis, and use the resulting optimal trading strategy to see whether the corresponding trading profits are statistically significant.
In our situation, we linearly regress the market’s outcome in terms of a binary 0-1 variable on the presently quoted probability and the past quoted probability. The previously quoted probability should be statistically insignificant. A significance of this variable would indicate a possible violation of the market efficiency hypothesis. Moreover, the regression analysis gives an alternative probabilistic opinion of the outcome, based on two predictors in terms of the past quoted probability and the currently quoted probability. This new estimate will be used in the following text to test whether the alternative probabilistic view could be monetized. Note that while it could be more natural to use non-linear regressions such as logit or probit on the binary outcome, the non-linearity of these methods would prevent us from the straightforward application of the martingale test.
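The regression described above can be sketched on simulated data. This is a minimal illustration, not the estimation code used in the paper: the simulated probabilities, the inclusion of an intercept, the classical (homoskedastic) standard errors and the helper names `ols`, `invert` and `simulate` are all assumptions made for the example. The simulated quotes satisfy the martingale property by construction, so the coefficient on the current quote should be close to one and the coefficient on the past quote close to zero.

```python
import random

def invert(a):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    k = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def ols(X, y):
    """Ordinary least squares; returns (coefficients, classical standard errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(r * r for r in resid) / (n - k)
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se

def simulate(n_matches, seed=0):
    """Hypothetical efficient market: E[p_t | p_s] = p_s and P(outcome = 1) = p_t."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_matches):
        p_s = rng.uniform(0.2, 0.8)              # past quoted probability
        p_t = p_s + rng.uniform(-0.15, 0.15)     # current quote, mean-zero shock
        outcome = 1.0 if rng.random() < p_t else 0.0
        X.append([1.0, p_t, p_s])                # intercept, current, past
        y.append(outcome)
    return X, y

X, y = simulate(5000)
beta, se = ols(X, y)
t_stats = [b / s for b, s in zip(beta, se)]
print("coefficients (intercept, current, past):", beta)
print("t-statistics:", t_stats)
```

A t-statistic on the past quote that is large in absolute value would be the signal of a possible violation; with a binary dependent variable, robust standard errors would be a natural refinement of this sketch.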
2.2. Spatial Martingale Test
The spatial martingale test is motivated by the standard calibration test used in the prediction markets literature (see Horn et al. (2014)). In an efficient market, the quoted probability should correspond to the observed frequency. Our approach differs in that we take into account only the first observation at which the probability is above a certain threshold. This corresponds to an evaluation of the process at a stopping time. Therefore, we test whether contracts are priced efficiently after exceeding or jumping above a certain threshold for the first time. Comparing the quoted probability with the frequency of the outcomes is also problematic in the context of football probabilities, as the evolution of the odds can experience dramatic jumps due to goals, and some probability levels may never be attained during the game, even for all contracts (win, loss, draw) combined. Thus, the first hitting time is more appropriate for our problem.
The novel spatial martingale test examines whether probabilities that exceed a certain threshold for the first time are the best estimate for the outcome, given all probabilities that exceed lower thresholds for the first time. Practically, this test examines whether there exists a profitable trading strategy like buying a contract that exceeds a threshold for the first time and holding it until the end of the match.
Before we can perform the regression test, let us define
as the probability observed at the hitting time
, where
is given as the first point of time that is outside the interval
:
with
. We can subsequently arrange the probabilities from the smallest to the highest threshold level. Hence, we transform the time series
with
into a stochastic process
with an increasing
. The first value of the new process is the initial probability of the time series, the second value is the first observation that is outside the interval
, the third value is the first observation that is outside the interval
, and so on. Note that
can be zero for some
s.
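The transformation from a time series into a process indexed by threshold levels can be illustrated with a short sketch. The function name `hitting_process`, the example series and the example intervals are hypothetical, and the use of open intervals is an assumption. The function records the initial probability and then, for each interval in turn, the first observation that lies outside it.

```python
def hitting_process(probs, intervals):
    """First observation outside each interval, in the order given.

    probs     : probabilities of one contract over time
    intervals : nested (low, high) pairs, ordered from narrowest to widest
    Returns the initial probability followed by one entry per interval;
    the entry is None when the series never leaves that interval.
    """
    values = [probs[0]]
    for low, high in intervals:
        # Assumption: the interval is open, so touching a bound counts as leaving.
        hit = next((p for p in probs if not (low < p < high)), None)
        values.append(hit)
    return values

# Hypothetical quotes with a jump (for example after a goal).
series = [0.50, 0.52, 0.55, 0.80, 0.78, 0.97]
levels = [(0.45, 0.55), (0.40, 0.60), (0.30, 0.70), (0.02, 0.98)]
process = hitting_process(series, levels)
print(process)
```

In this example the jump to 0.80 takes the series past two threshold levels at once, so two consecutive entries of the new process coincide, and the widest interval is never left.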
Now, we can test whether probabilities that exceed a certain threshold for the first time are the best estimate for the outcome, given all probabilities that exceed lower thresholds for the first time. This can be formulated in the following way:
where
is the probability at the hitting time
of contract
, market
and time
. Moreover,
is a binary variable on the outcome of the market, which is one if the event occurs, and zero otherwise.
In the spatial martingale regression test, we use the final result
as a dependent variable and
, as well as
as independent variables in the regression model,
with
.
We run separate panel regressions for every pair of threshold levels and estimate the regression coefficients. Under the null hypothesis that the probabilities satisfy the martingale hypothesis, the coefficient on the probability at the lower threshold should be statistically insignificant.
2.3. Application on Soccer Data
As an illustration of our approach, we examine betting data from Betfair for the English Premier League. The English Premier League is one of the most popular football leagues globally and an important market for the online betting industry. Each season of the English Premier League comprises 380 matches, in which 20 teams play each other twice, once at home and once away. A match consists of two 45 min periods separated by a 15 min break, plus some extra injury time. Each game has a clearly defined outcome (home win, away win, or draw), based on the final score of the home team and the away team. During the 15 min break, the odds are typically constant, as there is usually no information update. Thus, we remove the 15 min break to avoid multicollinearity issues in the regression analysis.
We focus on the most liquid market, the outcome of the game. The market offers three contracts for every match of the English Premier League season: home win, away win and draw. The contracts are also quoted during the actual game, which is called in-play. They are traded in real time and are quoted in terms of odds 1 : x. It is possible to buy an event for $1 and receive $x if the event occurs and $0 otherwise, see for instance Vecer et al. (2009). These odds can be interpreted as the probability that the event occurs, see for instance Wolfers and Zitzewitz (2006).
We use the in-play minute-by-minute odds for 362 of the 380 matches of the English Premier League season 2016/17 provided by Betfair. We split the data set into 262 training and 100 testing matches ordered by date, so that an agent who spotted a violation of the market efficiency hypothesis in the training data could monetize this information in the testing data. Moreover, we separate the data into three datasets based on the three events of home win, away win and draw. Finally, for the time martingale test, we cut off the odds after minute 94 to have well balanced datasets. On the other hand, we do not cut off the probabilities after minute 94 for the spatial martingale test and use all available probabilities when calculating the hitting times.
As for the independent variable, we transform the Betfair odds 1 : $x^c_{m,t}$ into probabilities $p^c_{m,t}$. We set $c \in \{H, A, D\}$ to represent the events of the home win, the away win and the draw, where $m$ indicates the match index and $t$ the in-play minute-by-minute time. The probabilities of the three events should add up to one for a given match and time. However, although the markets quote the odds with only a tiny margin, the inverse odds do not sum exactly to one, and we need to adjust them correspondingly. The relationship between the odds can be expressed by
$$\frac{1}{x^H_{m,t}} + \frac{1}{x^A_{m,t}} + \frac{1}{x^D_{m,t}} = 1 + \epsilon_{m,t},$$
where $\epsilon_{m,t}$ is the market margin. The probabilities of the three different events can be calculated using the following formula:
$$p^c_{m,t} = \frac{1 / x^c_{m,t}}{1 + \epsilon_{m,t}}.$$
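The margin adjustment can be sketched as follows. This is a minimal example that assumes the common proportional normalization, in which each inverse odd is divided by the sum of the three inverse odds; the function name `implied_probabilities` and the quoted odds are hypothetical.

```python
def implied_probabilities(odds):
    """Convert odds 1 : x into probabilities that sum to one.

    odds : dict mapping a contract name to its odds x (pay $1, receive $x)
    Returns (probabilities, margin), where 1 + margin is the sum of inverse odds.
    """
    inverse = {name: 1.0 / x for name, x in odds.items()}
    total = sum(inverse.values())
    margin = total - 1.0
    # Assumption: the margin is removed proportionally from all three contracts.
    probs = {name: q / total for name, q in inverse.items()}
    return probs, margin

# Hypothetical quotes for one match and one minute.
probs, margin = implied_probabilities({"home": 2.0, "away": 4.0, "draw": 3.8})
print(probs, margin)
```

Other ways of distributing the margin across the contracts exist; the proportional rule is used here only because it is the simplest one that makes the three probabilities add up to one.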
For the dependent variable, we use the final score of each match in our analysis. These data are provided by the official Premier League website, https://www.premierleague.com/results. We define the final result $Y^c_m$ of each contract $c$ (home win, draw, away win) based on the final score of each match $m$, where $g^H_m$ and $g^A_m$ represent the number of goals of the home team and the away team:
$$Y^H_m = 1_{\{g^H_m > g^A_m\}}, \quad Y^D_m = 1_{\{g^H_m = g^A_m\}}, \quad Y^A_m = 1_{\{g^H_m < g^A_m\}}.$$
Finally, we have three different panel datasets, one for every contract type, namely the home win, the away win and the draw, each with 362 matches and 95 time steps, as well as the final results for all three contracts and 362 matches.
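The coding of the dependent variable from the final score is simple enough to state as code. The function name `final_results` and the contract labels are hypothetical; the logic follows the description above, with exactly one of the three contracts paying out in every match.

```python
def final_results(home_goals, away_goals):
    """Binary outcome of each contract given the final score of a match."""
    return {
        "home": int(home_goals > away_goals),
        "draw": int(home_goals == away_goals),
        "away": int(home_goals < away_goals),
    }

example = final_results(2, 1)   # a 2:1 home win
print(example)
```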
The time martingale test examines whether the current in-play probability is the best estimate for the outcome of the match, given all prior probabilities. This is tested by the regression model (6), where $p^c_{m,t}$ is the probability extracted from the football betting odds of a contract $c$, match $m$ and time $t$. Moreover, $Y^c_m$ is a binary variable on the outcome of the match $m$, which is one if the event of contract $c$ occurs and zero otherwise. Under the null hypothesis that the probabilities satisfy the martingale property, the second coefficient, which belongs to the earlier probability, should be insignificant.
We estimate the coefficients of the time martingale regression model to test for the efficiency of the time series data using the Premier League 2016/17 minute-by-minute data with $0 \le s < t \le 94$. We run 4465 regressions for each of the three events (home win, away win and draw), as we have 4465 possible combinations of t and s. Under the null hypothesis, the second regression coefficient should be insignificant for all combinations of t and s. We report the results of the regressions in Table 1 and in Figure 1.
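The number of regressions follows from counting the pairs of distinct time points among the 95 time steps. The short check below assumes that the minutes are indexed from 0 to 94; the helper `share` is a hypothetical convenience for turning a count of significant coefficients into a percentage of all regressions.

```python
from itertools import combinations

minutes = range(95)                      # assumed indexing: minutes 0..94
pairs = list(combinations(minutes, 2))   # all (s, t) with s < t
n_regressions = len(pairs)               # 95 * 94 / 2 = 4465

def share(count):
    """Percentage of all regressions, rounded to two decimals."""
    return round(100 * count / n_regressions, 2)

print(n_regressions, share(110))
```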
Table 1 lists the number of instances in which the second coefficient is significant at the 5% and 1% levels.
Figure 1 visualizes the results of the regression analysis. The graphs represent the p-value of the second regression coefficient in the home, draw and away betting markets. Every colored point represents the p-value of a different regression, with the time t of the current probability on the y-axis and the time s of the past probability on the x-axis. Yellow represents p-values close to 100% and blue represents p-values close to 0%.
The home win data results indicate that 110 (2.46%) of the second regression coefficients are significant at the 5% level and 20 (0.45%) are significant at the 1% level. A closer look shows that the significant coefficients belong to regressions with parameters t and s after the break and close to the end of the match. The draw data set results show that 75 (1.68%) of the second regression coefficients are significant at the 5% level and 12 (0.27%) at the 1% level. For the away win data, 840 (18.81%) of the coefficients are significant at the 5% level and 363 (8.13%) at the 1% level. Again, the regression models with parameters after the 15-minute break show significant second regression coefficients. As we illustrate in the following section, the p-value significance in the training data results from random fluctuations and does not have any predictive power for the testing data set.
The spatial martingale test examines whether probabilities that exceed a certain threshold for the first time are the best estimates for the outcome, given all probabilities that exceed lower thresholds for the first time. This can be tested by estimating the coefficients of the regression model (9), in which the regressors are the probabilities observed at the hitting times of two threshold levels for a given contract and match. Again, the dependent variable is a binary variable on the outcome of the match, which is one if a specific event occurs and zero otherwise. Under the null hypothesis that the probabilities satisfy the martingale property, the second coefficient should be insignificant.
We estimate the coefficients of the spatial martingale model to test for efficiency of the spatial process using the Premier League 2016/17 minute-by-minute data over a grid of threshold levels. The possible pairs of a lower and a higher threshold level result in 1225 regressions. Under the null hypothesis, the second regression coefficient should be insignificant for all pairs of threshold levels. We report the results of the regressions in Table 2 and in Figure 2.
Table 2 shows the number of p-values that indicate significant second regression coefficients at the 5% and 1% levels, and the number of p-values of the second regression coefficient that are below the 5% and 1% levels of the comparable p-value distribution.
Figure 2 visualizes the results of the regressions. The graphs show the standard p-values of the second regression coefficient for the home, draw and away markets. Every colored point represents the p-value of a different regression, with the threshold level of one hitting-time probability on the y-axis and the threshold level of the other on the x-axis.
The home win data set results show that 35 (2.86%) of the p-values indicate a significant second regression coefficient at the 5% significance level and 9 (0.73%) at the 1% significance level. More specifically, regressions that combine two high threshold levels show p-values below 5%. This is due to the momentum of the game. High threshold levels indicate that the probability of winning is either very high or very low. The high significance of the second regression coefficient in these regressions means that all the games in our training set that entered 95% probability before reaching 99% probability of winning resulted in a win, and all the games that entered 5% probability before reaching 1% probability of winning resulted in a loss. This can easily happen, as we have 262 games in our training set and the probability of a game reversal at this stage is only 1%.
For the draw data set, the hitting-time probabilities are almost constant over part of the threshold range. The reason is that the initial quotes of the probability of the draw are typically below 30%. Thus, we limit ourselves to running 1035 regressions on the remaining threshold pairs. The results are in line with our previous findings. The second regression coefficient is significant 48 times (4.64%) at the 5% level and 42 times (4.06%) at the 1% level. It is almost always significant at the 1% level for regressions that involve the most extreme threshold, that is, a 99% or 1% probability quote. This is a similar phenomenon to the case of the home team win. It turns out that all games that reached the 99% probability threshold ended in a win in this market, and all games that reached the 1% probability threshold ended in a loss in this market. Thus, we see significant coefficients of the previous quotes at this threshold in Figure 2 in the draw market, which corresponds to the bottom horizontal line. The significance of the previous coefficient corresponds to a momentum trading strategy: buy the bet when it reaches 99% and sell the bet when it reaches 1%. Such a strategy has a 99% success rate to start with, so this can easily happen. However, drawing conclusions about the statistical significance of this trading strategy from a relatively small data set (262 games) is not appropriate, as a single exception to this rule would ruin both the significance of the second regression coefficient and the profitability of the trading strategy.
The results of the away win data set show that 45 (3.67%) of the coefficients are significant at the 5% level and 38 (3.10%) are significant at the 1% level. We see the same behavior of statistical significance of the second regression coefficient at the most extreme threshold, which is again caused by the fact that there was no game reversal after the market reached a 99% or 1% probability quote.