Information Theory and the Pricing of Contingent Claims: An Alternative Derivation of the Black–Scholes–Merton Formula

Abstract: This paper seeks to determine the best subjective probability to use when computing expectation values of uncertain future cash flows, using the smallest number of assumptions. The result is the unique distribution that guarantees no information is present beyond the stated assumptions. This yields a novel derivation of the well-known Black–Scholes equation without the need to introduce high-level mathematical machinery. This formalism fits nicely into introductory courses of finance, where the value of any financial instrument is given by the present value of uncertain future cash flows.


Introduction
Early on in finance, we learn that the fair value of a security is the present value of expected future cash flows:

$$V(t) = \sum_i Z(t, T_i)\,\mathbb{E}\left[c(T_i)\right], \tag{1}$$

where V(t) is the value of the security, c(T_i) is the cash flow paid at time T_i, and Z(t, T) is the price of a zero-coupon bond at time t, maturing at time T. Many of the formulae for fixed-income security pricing result from Equation (1) when the cash flows c(T_i) are known with certainty (Fabozzi and Mann 2021). However, when the cash flows are uncertain, it is natural to ask what probabilities should be used to determine the average implied by the expectation value in Equation (1).
The overwhelmingly accepted answer is, of course, the famous "risk-neutral probabilities" introduced by Black and Scholes (1973), in which many assumptions were made in order to justify the approach. The assumptions were those of the capital asset pricing model (CAPM) (Lintner 1965; Sharpe 1964), which depends upon general economic equilibrium, together with secondary assumptions such as a constant (or at least non-stochastic) interest rate and a constant variance of the stock over the lifetime of the option.
Robert Merton did not agree with many of these assumptions, especially the condition of economic equilibrium, saying of the derivation, "the portfolio weights are chosen to eliminate all 'market risk'. By the assumptions of the CAPM, any portfolio with zero ('beta') market risk must have an expected return equal to the risk-free rate. Hence an equilibrium condition is established between the expected return on the option, the expected return on the stock and the risk-less rate" (Merton 1973). Merton then set out to relax some of these assumptions by adding stochastic interest rates and, more importantly, proving that the expectations of the investors play no part in the fair value of an option. This insight came from a precise cancellation of terms when solving the equations for the portfolio to be arbitrage-free, that is, the initial capital investment is all that is required to hedge the portfolio without any inflows or outflows; the gains and losses of the portfolio suffice. Thus, the equilibrium assumptions of the CAPM are not required.
These equations are now known as the "market price of risk" and form the basis of dynamic asset pricing theory, as formalized by Harrison and Pliska (1981), who established the fundamental theorem of asset pricing by specifying the mathematical and economic requirements for arbitrage freedom in a given financial market.
The mathematical machinery needed to price a simple European option is, therefore, quite complex, requiring knowledge of stochastic calculus, dynamic asset pricing theory and the solution of partial differential equations, all of which can be very intimidating for a young student in finance. This paper seeks to determine a financially justified probability distribution without relying on any of this advanced mathematical machinery. The frameworks used are those of Information Theory (Shannon 1948) and the principle of maximum entropy developed by Jaynes (1957b, 2002).

Information Theory and Maximum Entropy
In 1948, Claude Shannon introduced Information Theory in his landmark paper "A Mathematical Theory of Communication" (Shannon 1948). Here, Shannon introduced the concept of informational entropy of a discrete probability distribution {p_α}:

$$H = -\sum_\alpha p_\alpha \ln p_\alpha \tag{2}$$

as a measure of information. It has the intuitive properties that a measure of information about an event should satisfy: it is zero when the outcome is certain (i.e., only one of the probabilities is 1.0), and it achieves a maximum when all outcomes are equally likely. From this fundamental quantity, he determined optimal ways to encode communication messages into abstract codes, where the environment can adversely affect the message (i.e., what is transmitted is not what is received). Later, in 1957, Ed Jaynes noted that the definition of entropy in statistical physics could be thought of as informational entropy (Jaynes 1957a) and introduced a way to generate a probability distribution that reproduces the empirical data without adding any additional assumptions, i.e., a distribution that "...has the important property that no possibility is ignored; it assigns a positive weight to every situation that is not absolutely excluded by the given information".
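The two properties of the entropy mentioned above can be checked numerically. The following is a minimal sketch, assuming the natural-logarithm form of the entropy (so the result is in nats); the function name `shannon_entropy` and the example distributions are illustrative choices, not taken from the paper.

```python
import math

def shannon_entropy(probs):
    """Return H = -sum(p * ln p) in nats; zero-probability outcomes contribute nothing."""
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    return sum(-p * math.log(p) for p in probs if p > 0.0)

# Hypothetical four-outcome distributions, chosen only for illustration.
certain = shannon_entropy([1.0, 0.0, 0.0, 0.0])      # one outcome is certain
skewed = shannon_entropy([0.7, 0.1, 0.1, 0.1])       # some outcomes favoured
uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # all outcomes equally likely

print(certain, skewed, uniform)
```

The certain distribution has zero entropy, and the uniform distribution attains the largest value, ln 4 for four outcomes, with the skewed case in between.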
We can apply this concept in finance, as it is an open question whether asset price fluctuations can have an objective probability distribution. The maximum entropy framework does not assume one, whereas an objective probability distribution is required by the Black-Scholes-Merton framework and by all models that attempt to go beyond the Black-Scholes-Merton assumptions (Cox and Ross 1976; Dupire 1994; Heston 1993; Merton 1976). Stocks are not physical particles subject to the immutable laws of physics; they are, rather, a human construct with prices traded in a market driven by human emotion. Thus, we seek to obtain a subjective probability distribution that not only relies on all of the information that exists in the market, but is "maximally noncommittal to missing information" (Jaynes 1957a), meaning this distribution, once found, does not (cannot) contain any further assumptions than those that are used in its derivation. In this paper, we show that only two assumptions are required to derive this unique distribution:
1. There is a forward contract on the stock, which is fairly priced in the market;
2. The distribution has a variance; otherwise, statistical measures of risk are difficult to quantify.
Using only these assumptions, the subjective probability can be determined.
In the first section of this paper, we use the method of Lagrange multipliers to apply the constraints and derive the subjective probability. In the second section, we use this unique probability distribution to determine the price of a European call option and show that the Black-Scholes equation is obtained.

Subjective Probability
We assume that there is a liquidly traded market in forward contracts for this stock. A forward contract is a contract struck at time t, where both parties are obliged to trade the stock at a later time T for an agreed-upon price, regardless of whether or not the buyer could obtain a lower price in the market at that future time. We denote the forward price by F(t, T).
Suppose that the forward price can be obtained by summing over all possible future values that the stock could attain, each weighted by a probability:

$$F(t, T) = \sum_\alpha p_\alpha S_\alpha, \tag{3}$$

where S_α is the stock price at time T in outcome α and p_α is the probability of that outcome. We can think of this probability as containing the totality of information that all market participants collectively possess. It is this probability that we seek to determine.
The second piece of information reflects the fact that the market participants know, almost certainly, that the stock price will not exactly attain the forward price at time T; there is some uncertainty in the future outcome. They must decide on a measure of uncertainty in the forward price F(t, T), and, more importantly, they must decide how this measure of uncertainty is determined and quoted. Once chosen, the participant can calculate the subjective distribution, which represents the least-biased estimate based on these two pieces of information. By following this prescription, we show how to arrive at the Black-Scholes formula.
To motivate the difference between this approach and the traditional approach, we examine what we mean by a subjective probability distribution rather than an objective one. By subjective, we mean a distribution that is subjective in "the sense that it describes only a state of knowledge, and not anything that could be measured in a physical experiment" (Jaynes 2002), rather than a prescriptive probability distribution. For instance, one could posit a stochastic differential equation for the underlying stock price:

$$dS(t) = \mu\,S(t)\,dt + \sigma\,S(t)\,dW(t), \tag{4}$$

which a priori describes the dynamics of the stock price and, in principle, gives complete information on the statistics of the movements. A posteriori, these statistics can be empirically tested and (4) can be augmented (or rejected) if need be.
On the other hand, subjective probability distributions do not prescribe any dynamics or constrain the system at all; they are an inferential tool used when the underlying dynamics of the system are too complex to determine (or do not even exist at all).
To begin, we specify that volatility is a measure of the distribution of returns on the underlying asset. Mathematically, we write that the variance is an average of the square of the logarithm of the future underlying price S_α relative to the current price S(t), measured about its mean µ(t, T):

$$v(t, T) = \sum_\alpha p_\alpha \left[\ln\frac{S_\alpha}{S(t)} - \mu(t, T)\right]^2. \tag{5}$$

By using the maximum entropy framework to determine the probabilities, we guarantee that the results do not implicitly contain any further assumptions or biases.
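The derivation of the subjective distribution proceeds by maximization of the Shannon entropy (Shannon 1948):

$$H[p] = -\sum_\alpha p_\alpha \ln p_\alpha \tag{6}$$

subject to the constraints

$$g_1 \equiv \sum_\alpha p_\alpha \ln\frac{S_\alpha}{S(t)} - \mu(t, T) = 0, \tag{7}$$

$$g_2 \equiv \sum_\alpha p_\alpha \left[\ln\frac{S_\alpha}{S(t)} - \mu(t, T)\right]^2 - v(t, T) = 0, \tag{8}$$

$$g_3 \equiv \sum_\alpha p_\alpha - 1 = 0. \tag{9}$$

These particular constraints enforce that the forward price will be priced correctly (the functional form of µ will be fixed a posteriori to enforce the average (3)), that the volatility is given by Equation (5), and that the probability is normalized.

Functional maximization subject to constraints is carried out by the method of Lagrange multipliers:

$$\mathcal{L} = -\sum_\alpha p_\alpha \ln p_\alpha - \gamma_1 g_1 - \gamma_2 g_2 - \lambda g_3. \tag{10}$$

Setting the derivative of (10) with respect to each p_α to zero gives the probability

$$p_\alpha = \exp\left\{-1 - \lambda - \gamma_1 \ln\frac{S_\alpha}{S(t)} - \gamma_2\left[\ln\frac{S_\alpha}{S(t)} - \mu(t, T)\right]^2\right\}. \tag{11}$$

The Lagrange multipliers are determined by the set of constraint functions {g_i = 0}, i = 1, 2, 3. Instead of fixing λ, we introduce the "partition function" Z ≡ exp(1 + λ), where the normalization constraint (9) requires

$$Z = \sum_\alpha \exp\left\{-\gamma_1 \ln\frac{S_\alpha}{S(t)} - \gamma_2\left[\ln\frac{S_\alpha}{S(t)} - \mu(t, T)\right]^2\right\}. \tag{12}$$

With this definition, the probability in (11) becomes

$$p_\alpha = \frac{1}{Z}\exp\left\{-\gamma_1 \ln\frac{S_\alpha}{S(t)} - \gamma_2\left[\ln\frac{S_\alpha}{S(t)} - \mu(t, T)\right]^2\right\}. \tag{13}$$

In order to make contact with the Black-Scholes derivation, we make the substitution $\sum_\alpha \to \int_0^\infty dx$, with the discrete prices S_α replaced by the continuous variable x. The partition function now has a closed-form solution:

$$Z = \int_0^\infty dx\,\exp\left\{-\gamma_1 \ln\frac{x}{S(t)} - \gamma_2\left[\ln\frac{x}{S(t)} - \mu\right]^2\right\} = S(t)\sqrt{\frac{\pi}{\gamma_2}}\,\exp\left[(1 - \gamma_1)\,\mu + \frac{(1 - \gamma_1)^2}{4\gamma_2}\right]. \tag{14}$$

The further two constraints {g_i = 0}, i = 1, 2 are enforced by choosing the γ_i such that

$$-\frac{\partial \ln Z}{\partial \gamma_1} = \mu(t, T), \tag{15}$$

$$-\frac{\partial \ln Z}{\partial \gamma_2} = v(t, T), \tag{16}$$

where µ is held fixed when differentiating. These equations have solution

$$\gamma_1 = 1, \tag{17}$$

$$\gamma_2 = \frac{1}{2\,v(t, T)}. \tag{18}$$

With the constraints enforced, the probability in (11) becomes

$$p(x) = \frac{1}{x\sqrt{2\pi\,v(t, T)}}\exp\left\{-\frac{\left[\ln\frac{x}{S(t)} - \mu(t, T)\right]^2}{2\,v(t, T)}\right\}. \tag{19}$$

The next step is to identify the averages v(t, T) and µ(t, T) with their corresponding financial parameters. The variance v(t, T) is identified as the integral of the squared volatility term structure σ(s) that the market participants wish to use to model the underlying,

$$v(t, T) = \int_t^T \sigma^2(s)\,ds; \tag{20}$$

the maximum entropy distribution does not limit this choice. The average quantity µ(t, T) is fixed by the forward price (3):

$$F(t, T) = \int_0^\infty dx\,x\,p(x) = S(t)\exp\left[\mu(t, T) + \tfrac{1}{2}v(t, T)\right], \tag{21}$$

which is an implicit equation fixing µ(t, T) in terms of the forward price F(t, T) with solution

$$\mu(t, T) = \ln\frac{F(t, T)}{S(t)} - \tfrac{1}{2}v(t, T). \tag{22}$$

We are now in a position to price any contingent claim armed with the probability distribution

$$p(x) = \frac{1}{x\sqrt{2\pi\,v(t, T)}}\exp\left\{-\frac{\left[\ln\frac{x}{F(t, T)} + \tfrac{1}{2}v(t, T)\right]^2}{2\,v(t, T)}\right\}. \tag{23}$$

In particular, we can now price the European call option by explicitly calculating the present value of the uncertain future cash flow using the probability distribution that is guaranteed to not have any more information, or assumptions, than the two that we imposed: it prices forward prices exactly, and the distribution has a variance.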
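As a numerical sanity check on the two assumptions, the sketch below takes the maximum entropy distribution to be a lognormal density in the future price x, with log-variance v(t, T) and mean equal to the forward price F(t, T), and verifies by direct integration that it is normalized, reprices the forward, and has the required mean and variance of the log-return. The market inputs, the flat volatility, and the helper names `maxent_density` and `expectation` are assumptions made only for illustration.

```python
import math

def maxent_density(x, forward, v):
    """Lognormal density in the price x with log-variance v and mean equal to forward."""
    z = math.log(x / forward) + 0.5 * v
    return math.exp(-z * z / (2.0 * v)) / (x * math.sqrt(2.0 * math.pi * v))

def expectation(func, forward, v, n=20000, width=10.0):
    """Midpoint-rule estimate of the integral of func(x) * p(x) over x > 0.

    Integrates in y = ln(x / forward), so dx = x dy, over +/- width standard
    deviations around the centre of the log-price distribution.
    """
    sd = math.sqrt(v)
    lo = -0.5 * v - width * sd
    dy = 2.0 * width * sd / n
    total = 0.0
    for i in range(n):
        x = forward * math.exp(lo + (i + 0.5) * dy)
        total += func(x) * maxent_density(x, forward, v) * x * dy
    return total

# Hypothetical market inputs, chosen only for illustration.
spot, forward = 100.0, 105.0
sigma, tau = 0.2, 1.0
v = sigma ** 2 * tau                      # flat volatility term structure
mu = math.log(forward / spot) - 0.5 * v   # mean log-return implied by the forward

norm_check = expectation(lambda x: 1.0, forward, v)                           # should be 1
mean_check = expectation(lambda x: x, forward, v)                             # should be forward
log_mean = expectation(lambda x: math.log(x / spot), forward, v)              # should be mu
log_var = expectation(lambda x: (math.log(x / spot) - mu) ** 2, forward, v)   # should be v

print(norm_check, mean_check, log_mean, log_var)
```

Integrating in the log-price keeps the integrand smooth and close to Gaussian, so a plain midpoint rule is sufficient here; a quadrature library would serve equally well.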

Re-Deriving the Black-Scholes Formula
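We look at the simplest security with uncertain cash flows, that of a European call option on a stock with a current traded market price S(t). The option gives the buyer the right, but not the obligation, to purchase the stock at the strike price K at a future time T > t. That is, we seek to determine the unique value of C(t, T) that is the present value of the uncertain future cash flow:

$$C(t, T) = Z(t, T)\,\mathbb{E}\left[\max\left(S(T) - K,\,0\right)\right], \tag{24}$$

where Z(t, T) is the zero-coupon bond price of Equation (1). Inserting the subjective probability (23) into this payoff formula, i.e.,

$$C(t, T) = Z(t, T)\int_K^\infty dx\,(x - K)\,p(x), \tag{25}$$

the price can be calculated exactly:

$$C(t, T) = Z(t, T)\left[F(t, T)\,N(d_+) - K\,N(d_-)\right], \tag{26}$$

where N(x) is the cumulative normal distribution and

$$d_\pm = \frac{\ln\frac{F(t, T)}{K} \pm \tfrac{1}{2}v(t, T)}{\sqrt{v(t, T)}}. \tag{27}$$

Equation (26) is the celebrated Black-Scholes formula for the price of a European call option with a volatility term structure.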
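The pricing step can be illustrated with a short sketch that computes the call price two ways: from the closed-form expression written in terms of the forward price, and by integrating the discounted payoff directly against a lognormal density whose mean is the forward price. The inputs are hypothetical, and the forward is assumed to equal S(t)/Z(t, T), the usual relation for a stock paying no dividends; the function names are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_closed_form(discount, forward, strike, v):
    """Discounted forward-form Black-Scholes call price with total variance v."""
    sd = math.sqrt(v)
    d_plus = (math.log(forward / strike) + 0.5 * v) / sd
    d_minus = d_plus - sd
    return discount * (forward * norm_cdf(d_plus) - strike * norm_cdf(d_minus))

def call_by_integration(discount, forward, strike, v, n=50000, width=10.0):
    """Discounted expected payoff, integrated over y = ln(x / forward) by the midpoint rule."""
    sd = math.sqrt(v)
    lo = math.log(strike / forward)   # the payoff is zero below the strike
    hi = -0.5 * v + width * sd        # truncate the negligible upper tail
    dy = (hi - lo) / n
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * dy
        x = forward * math.exp(y)
        z = y + 0.5 * v
        # Density of y, i.e. the lognormal density p(x) multiplied by x.
        weight = math.exp(-z * z / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        total += (x - strike) * weight * dy
    return discount * total

# Hypothetical inputs, chosen only for illustration.
spot, strike = 100.0, 100.0
rate, sigma, tau = 0.05, 0.2, 1.0
discount = math.exp(-rate * tau)   # zero-coupon bond price Z(t, T) for a flat rate
forward = spot / discount          # assumed forward price for a non-dividend stock
v = sigma ** 2 * tau               # flat volatility term structure

closed = call_closed_form(discount, forward, strike, v)
numeric = call_by_integration(discount, forward, strike, v)
print(closed, numeric)
```

The two numbers agree to within the accuracy of the midpoint rule, which is the content of the claim that the expectation under the subjective distribution reproduces the Black-Scholes price.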

Conclusions
The standard way of teaching the pricing of contingent claims in finance classes relies on the Black-Scholes-Merton framework, stochastic calculus and dynamic asset pricing theory. While mathematically sound, this approach has two major pedagogical drawbacks. First, the student may not have been exposed to such advanced mathematics, and second, more fundamentally, a stock price may not have an objective probability distribution such as that of, say, a diffusing pollen particle. Stocks are human creations and their trading depends on human emotion. Further, if a model becomes accepted in the market, this can fundamentally alter the way an asset trades.
In this paper, we have derived the same equation without the use of stochastic calculus or dynamic asset pricing theory; therefore, this approach could be better suited to early finance courses. Furthermore, the Black-Scholes equation is derived with only two assumptions: that a forward contract is traded in the market, and that the probability distribution used has some measure of dispersion. Importantly, the derivation comes with a guarantee that no other information has been used. A recent paper extends the analysis in this paper by using the maximum entropy formulation and all available option prices to infer higher moments of the distribution (Ardakani 2022).
Although we do not believe this supplants the standard treatment of pricing contingent claims, this method offers a new perspective on the age-old problem, and one that fits easily into an introductory framework of finance without the need to complicate matters by the introduction of advanced concepts such as filtrations, stochastic calculus and numeraires.
Funding: This research received no external funding.