The Impact of the Prior Density on a Minimum Relative Entropy Density: A Case Study with SPX Option Data

We study the problem of finding probability densities that match given European call option prices. To allow prior information about such a density to be taken into account, we generalise the algorithm presented in Neri and Schneider (2011) to find the maximum entropy density of an asset price to the relative entropy case. This is applied to study the impact the choice of prior density has in two market scenarios. In the first scenario, call option prices are prescribed at only a small number of strikes, and we see that the choice of prior, or indeed its omission, yields notably different densities. The second scenario is given by CBOE option price data for S&P500 index options at a large number of strikes. Prior information is now considered to be given by calibrated Heston, Schoebel-Zhu or Variance Gamma models. We find that the resulting digital option prices are essentially the same as those given by the (non-relative) Buchen-Kelly density itself. In other words, in a sufficiently liquid market the influence of the prior density seems to vanish almost completely. Finally, we study variance swaps and derive a simple formula relating the fair variance swap rate to entropy. Then we show, again, that the prior loses its influence on the fair variance swap rate as the number of strikes increases.


Introduction
Many financial derivatives are valued by calculating their expected payoff under the risk-neutral measure. For path-independent derivatives the expectation can be obtained by integrating the product of the payoff function and the density. If a pricing model has been chosen for a market in which many derivative products are actively quoted, then often, due to the limited number of model parameters, this model will be unable to perfectly match the market quotes, and a compromise must be made during model calibration by using some kind of "best-fit" criterion.
If no model has been chosen, one can try to imply from the market data for a given maturity a probability density function that leads, by integration as described above, exactly back to the quoted prices. However, unless the market is perfectly liquid, there will be infinitely many densities that match the price quotes, and some criterion for the selection of the density will have to be applied. One such criterion is to choose the density that maximises uncertainty or, in other words, entropy. The idea is that between two densities matching the constraints imposed by the market prices, the one that is more uncertain should be chosen, where "uncertain" means, very roughly speaking, spreading probability over a large interval instead of assigning it to just a few points, where possible. In general, applying the criterion of entropy delivers convincing results. For example, on the unit interval [0, 1], if no constraints are given, the density with the greatest entropy, the maximum entropy density (MED), is the uniform density. On the positive real numbers [0, ∞[, if the mean is given as the only constraint, the entropy maximiser is the exponential density. There is no entropy maximiser over the real numbers R, but if the mean and variance are imposed as constraints, then the density with the largest entropy is a Gaussian (normal) density.
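As a small worked check of the half-line example, the following sketch compares the exponential density with one other density of the same mean m. The uniform comparison density is chosen here purely for illustration, and the entropy functional is the usual $H(q) = -\int q \ln q$.

```latex
% Exponential density with mean m on [0, \infty[
q_{\exp}(S) = \tfrac{1}{m}\, e^{-S/m}, \qquad
H(q_{\exp}) = -\int_0^\infty q_{\exp}(S)\Bigl(-\ln m - \tfrac{S}{m}\Bigr)\, dS = 1 + \ln m .

% Uniform density on [0, 2m], which has the same mean m
q_{\mathrm{uni}}(S) = \tfrac{1}{2m}\,\mathbf{1}_{[0,2m]}(S), \qquad
H(q_{\mathrm{uni}}) = \ln(2m) = \ln 2 + \ln m \;<\; 1 + \ln m .
```

Since ln 2 < 1, the exponential density has the larger entropy of the two, as expected.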
The concept of entropy has its origins in the works of Boltzmann [5] in Statistical Mechanics and Shannon [33] in Information Theory, and an important recent application has been by Villani [35] and others in the field of Optimal Transport. In Finance, too, entropy has become a popular tool, as a survey of recent literature (see for example [3,7,8,9,18,19,21,31]) confirms.
A third approach, which we will use to combine the two approaches described above, is to take a density p, which may be near to matching some imposed constraints, as a prior density and to then find a density q that is "as close" as possible to p and exactly matches these constraints. The criterion we will employ to measure the "distance" between the densities q and p is that of relative entropy. Since we are trying to depart as little as possible from the prior p, our goal will be to find the minimum relative entropy density (MRED) q that matches the given constraints. Of course, if the prior p already matches the constraints, then we can take as our solution q = p, since the relative entropy of p with respect to itself is zero. Relative entropy was introduced by Kullback and Leibler [24] and is also known as the Kullback-Leibler information number I = I(q‖p) or I-divergence [15]. Although it is always nonnegative and can be used as a measure of distance, it is important to stress that it is not a metric in the mathematical sense, since usually I(q‖p) ≠ I(p‖q), and the triangle inequality is not satisfied.
In the study we carry out in this paper, the prior density function of the asset price for a fixed maturity is given by a model, such as the Black-Scholes model or the Heston stochastic volatility model. Depending on the model in question, this prior density will be either directly available in analytical form (in the case of the Black-Scholes model a log-normal distribution), or have to be obtained numerically (in case of the Heston model via Fourier inversion). The main impact of this will be on computation time, but otherwise the difference is of minor consequence. The algorithms we propose to calculate the MRED q (with respect to p) satisfying some constraints given by European option prices are extensions of the two algorithms presented in [28] and [29].
In the first case, the option data consists of call and digital call prices (section 3), and in the second case only of call prices (section 4). If only the call prices are imposed, say n of them, the problem consists in finding the minimum of a real-valued, convex function (the relative entropy function) in n variables. If one additionally imposes the n prices of digital options at the same strikes, the problem simplifies to a sequence of n one-dimensional root-finding problems. The multi-dimensional algorithm makes use of the single-dimensional one by fixing the set of call prices, defining a parameter space Ω for arbitrage-free digital prices, and then finding the unique density in this family with the smallest relative entropy.
The models we take to generate our prior densities are presented in section 5, together with a review of the characteristic function pricing approach and corresponding Fourier transform techniques. In addition to the two models already mentioned above, we also consider the Schöbel-Zhu stochastic volatility model and the Variance Gamma model.
In section 6 we study two market scenarios. In the first one, we take a log-normal Black-Scholes and a Heston density as our priors and calculate the MREDs that match given call prices. We then compare option prices to those obtained with an MED, and also to those obtained with another log-normal density that matches the constraints, and observe that the price differences can be substantial. In the second scenario, we take S&P500 call option prices from the CBOE. This market is very liquid, and for the maturity we consider we have quotes for a large set of strikes. We calibrate a Heston, Schöbel-Zhu and Variance Gamma model to this data and use the densities generated by these models as prior densities. Then, we calculate the three MREDs for these priors, and compare the digital option prices they give to those given by the original models, those given by an MED, and finally the market prices themselves, which are available in this case. We observe that it makes almost no difference which model is chosen for the prior, and that all three MREDs essentially agree with the MED.
In section 7 we study variance swaps and the fair swap rate. Assuming that the underlying asset follows a diffusion process without jumps, it is possible to relate this rate to the price of a log-contract. A formula linking it to an integral over call and put prices at varying strikes is also well known ([10,16,20]). Here, we establish a simple formula (Corollary 7.2) that relates the fair variance swap rate to entropy. We then give an explicit formula (see equation 37) for the fair variance swap rate in the case of a (non-relative) MED in terms of the assumed drift rate and the density's parameters. In the relative entropy case, we calculate the fair rate numerically and show that for MREDs constrained by data at very few strikes the prior density can have a significant impact on the fair rate. However, as in the examples given in section 6, the impact of the prior density diminishes quite strongly as data at more strikes are added as constraints. Finally, section 8 concludes the article.

Relative Entropy and Option Prices
In this section we review the concept of relative entropy, which can be regarded as a way of measuring the "distance" between two given densities. Our goal is to apply this measure to the following problem: Given a prior density p, coming for example from a model that fits well, but not exactly, European option prices observed in the market, how can we deform this density in such a way that it exactly matches these prices, but stays as close as possible to the original density under the criterion of relative entropy?

Relative Entropy
For two probability distributions Q and P the relative entropy of Q with respect to P is defined by
$$H(Q\|P) := E^{Q}\left[\ln\frac{\partial Q}{\partial P}\right] = E^{P}\left[\frac{\partial Q}{\partial P}\ln\frac{\partial Q}{\partial P}\right], \tag{1}$$
where ∂Q/∂P is the Radon-Nikodým derivative.
From the inequality S ln S ≥ S − 1 we have
$$H(Q\|P) \ge E^{P}\left[\frac{\partial Q}{\partial P} - 1\right] = 0.$$
We also have H(Q‖P) = 0 if, and only if, Q = P. However, relative entropy is not a metric since, in general, H(Q‖P) ≠ H(P‖Q), and the triangle inequality is not satisfied either. Even the symmetric function H(Q‖P) + H(P‖Q) does not define a metric, since it still does not satisfy the triangle inequality [13].
The Csiszár-Kullback inequality [2,14] relates relative entropy to distance between densities in the sense of the $L^1(0,\infty)$ norm:
$$\|q - p\|_{L^1(0,\infty)} \le \sqrt{2\,H(q\|p)},$$
where
$$H(q\|p) := \int_0^\infty q(S)\,\ln\frac{q(S)}{p(S)}\,dS \tag{2}$$
is the same definition as (1) above in terms of densities. This means in particular that convergence in the sense of relative entropy implies $L^1$-convergence.
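The asymmetry of relative entropy and the Csiszár-Kullback bound can be checked numerically on a simple pair of densities. The sketch below is illustrative only: it assumes two exponential densities (rates 1 and 2, chosen here because closed forms are available), truncates the integrals at a hypothetical upper limit, and uses a plain midpoint rule.

```python
import math

def exp_pdf(rate):
    """Density of an exponential distribution with the given rate."""
    return lambda s: rate * math.exp(-rate * s)

def integrate(f, a, b, n=100000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def relative_entropy(q, p, upper=60.0):
    """H(q||p) = int q ln(q/p) dS, truncated to [0, upper] (assumed cut-off)."""
    return integrate(lambda s: q(s) * math.log(q(s) / p(s)), 0.0, upper)

def l1_distance(q, p, upper=60.0):
    """L1 distance between two densities, truncated to [0, upper]."""
    return integrate(lambda s: abs(q(s) - p(s)), 0.0, upper)

q, p = exp_pdf(1.0), exp_pdf(2.0)
h_qp = relative_entropy(q, p)  # closed form: ln(1/2) + 1
h_pq = relative_entropy(p, q)  # closed form: ln(2) - 1/2
dist = l1_distance(q, p)       # closed form: 1/2
```

The two relative entropies differ, which illustrates why H is not a metric, while the L1 distance stays below both bounds sqrt(2 H).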

Minimizer Matching Option Prices
We now give a precise formulation of the minimisation problems that we want to solve. Let p be the prior density on [0, ∞[ which is assumed to be strictly positive almost everywhere.
For a fixed underlying asset and maturity T, we are given undiscounted prices $\tilde C_1, \dots, \tilde C_n$ of call options at strictly increasing strikes $K_1 < \dots < K_n$. For notational convenience, we introduce the "strikes" $K_0 := 0$ and $K_{n+1} := \infty$ and make the convention that $\tilde C_0$ is the forward asset price for time T and $\tilde C_{n+1} = 0$.
In section 4 we will determine a density q for the underlying asset price S(T) at maturity which minimises relative entropy H(q‖p) under the constraints
$$\int_{K_i}^{\infty} (S - K_i)\, q(S)\, dS = \tilde C_i, \qquad i = 0, \dots, n. \tag{3}$$
Before that, in section 3, we shall assume that undiscounted digital option prices $\tilde D_1, \dots, \tilde D_n$ on the same asset, maturity and strikes are also given and we look for q that, in addition, verifies the constraints
$$\int_{K_i}^{\infty} q(S)\, dS = \tilde D_i, \qquad i = 0, \dots, n. \tag{4}$$
Again, for ease of notation, we make the convention that $\tilde D_0 = 1$ and $\tilde D_{n+1} = K_{n+1}\tilde D_{n+1} = 0$.
Notice that the constraints (3) and (4) for i = 0 are consistent with the fact that q is a density (its integral is 1) and the martingale condition
$$E^{q}[S(T)] = \int_0^\infty S\, q(S)\, dS = \tilde C_0.$$

Minimizer Matching Call and Digital Option Prices
In this section we review some results stated in [28] and provide the base arguments required to prove them in case a prior density p is given and call and digital option prices are prescribed.
In addition, we show how the algorithm presented in [28] can be efficiently implemented. We do not assume that p is given analytically, and therefore the implementation requires numerical integration. However, the availability of the digital prices allows for an efficient solution locally in each "bucket", i.e., interval [K i , K i+1 [, via a one-dimensional Newton-Raphson rootfinder.
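Formally applying the Lagrange multipliers theorem, as in [9], it can be "proven" that if q minimises relative entropy with respect to p, then the Radon-Nikodým derivative g := ∂Q/∂P = q/p is piecewise exponential. More precisely, on each interval $[K_i, K_{i+1}[$ the density q is given by
$$q(S) = \alpha_i\, e^{\beta_i S}\, p(S), \tag{5}$$
where $\alpha_i, \beta_i \in R$, $\alpha_i > 0$, are parameters that still have to be determined using the following two constraints imposed by the option data, which are an equivalent reformulation of the constraints (3) and (4) given above, but allow for an easy solution.
The first constraint follows directly from (4) and is given by
$$\int_{K_i}^{K_{i+1}} q(S)\, dS = \tilde D_i - \tilde D_{i+1},$$
from which we have
$$\alpha_i = \frac{\tilde D_i - \tilde D_{i+1}}{\int_{K_i}^{K_{i+1}} e^{\beta_i S}\, p(S)\, dS} \qquad \forall i = 0, \dots, n. \tag{6}$$
The second constraint also follows directly from (3) and (4) and is given by
$$\int_{K_i}^{K_{i+1}} S\, q(S)\, dS = \bigl(\tilde C_i + K_i \tilde D_i\bigr) - \bigl(\tilde C_{i+1} + K_{i+1} \tilde D_{i+1}\bigr).$$
Notice that the right hand side of the equation above is the undiscounted price of an "asset-or-nothing" derivative that pays the asset price itself if it finishes between the two strikes $K_i$ and $K_{i+1}$ at maturity and zero otherwise. Substituting $\alpha_i$ from (6) gives
$$\frac{\int_{K_i}^{K_{i+1}} S\, e^{\beta_i S}\, p(S)\, dS}{\int_{K_i}^{K_{i+1}} e^{\beta_i S}\, p(S)\, dS} = \frac{\bigl(\tilde C_i + K_i \tilde D_i\bigr) - \bigl(\tilde C_{i+1} + K_{i+1} \tilde D_{i+1}\bigr)}{\tilde D_i - \tilde D_{i+1}}, \tag{7}$$
which we use as an implicit equation for $\beta_i$.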
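The two bucket-wise targets, the probability mass of each interval and the mean of the asset price within it, depend on the option prices only. The sketch below computes them under the conventions stated earlier (K_0 = 0, C_0 equal to the forward, D_0 = 1, and C_{n+1} = D_{n+1} = K_{n+1} D_{n+1} = 0). The function name and the sample prices are hypothetical and serve only as an illustration.

```python
def bucket_targets(forward, strikes, calls, digitals):
    """Return (p, k_bar): bucket probabilities and bucket means for i = 0..n.

    p[i]     = D_i - D_{i+1}
    k_bar[i] = [(C_i + K_i D_i) - (C_{i+1} + K_{i+1} D_{i+1})] / p[i]
    """
    k = [0.0] + list(strikes)            # K_0 = 0
    c = [forward] + list(calls) + [0.0]  # C_0 = forward, C_{n+1} = 0
    d = [1.0] + list(digitals) + [0.0]   # D_0 = 1, D_{n+1} = 0
    # Asset-or-nothing values C_i + K_i * D_i; the value at K_{n+1} is 0 by convention.
    a = [c[i] + k[i] * d[i] for i in range(len(k))] + [0.0]
    p = [d[i] - d[i + 1] for i in range(len(k))]
    k_bar = [(a[i] - a[i + 1]) / p[i] for i in range(len(k))]
    return p, k_bar

# Illustrative (made-up) undiscounted prices at strikes 80, 100, 120.
p, k_bar = bucket_targets(100.0, [80.0, 100.0, 120.0],
                          [22.0, 10.0, 4.0], [0.8, 0.45, 0.15])
```

For arbitrage-free inputs the probabilities sum to one, the bucket means average back to the forward, and each mean lies inside its own bucket.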
Later we shall rigorously show that, under non-arbitrage conditions, such $\alpha_i$ and $\beta_i$ (for i = 0, ..., n) exist and that q given by (5) is indeed a relative entropy minimiser with respect to the prior density p. First, however, we need some preliminary definitions and results.
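We define the cumulant generating functions $c_0, \dots, c_n$, from R to R ∪ {∞}, by
$$c_i(\beta) := \ln \int_{K_i}^{K_{i+1}} e^{\beta S}\, p(S)\, dS. \tag{8}$$
Notice that $c_i(\beta) < \infty$, for i < n and β ∈ R, since $p \in L^1(K_i, K_{i+1})$ and the exponential function belongs to $L^\infty(K_i, K_{i+1})$. For i = n, the integral is over $[K_n, \infty[$ and we can have $c_n(\beta) = \infty$. However, $c_n(0) < \infty$ and if $e^{\bar\beta S} p(S)$ belongs to $L^1(K_n, \infty)$, then so does $e^{\beta S} p(S)$ for $\beta < \bar\beta$. Therefore, the interior of $c_i$'s effective domain is an interval of the form $]-\infty, \beta^*[$ for some $\beta^* \ge 0$ and, for i < n, $\beta^* = \infty$.
Proposition 3.1 For i < n, $c_i$ is twice differentiable and strictly convex in $]-\infty, \beta^*[$. Moreover, its first and second derivatives are given by
$$c_i'(\beta) = \frac{\int_{K_i}^{K_{i+1}} S\, e^{\beta S}\, p(S)\, dS}{\int_{K_i}^{K_{i+1}} e^{\beta S}\, p(S)\, dS} \tag{9}$$
and
$$c_i''(\beta) = \frac{\int_{K_i}^{K_{i+1}} S^2\, e^{\beta S}\, p(S)\, dS}{\int_{K_i}^{K_{i+1}} e^{\beta S}\, p(S)\, dS} - \bigl(c_i'(\beta)\bigr)^2. \tag{10}$$
Proof The differentiability of $c_i$ and $c_i'$ together with (9) and (10) follows immediately. Now we shall prove that $c_i'' > 0$. We start by noticing that
$$\int_{K_i}^{K_{i+1}}\!\!\int_{K_i}^{K_{i+1}} (S_1 - S_2)^2\, e^{\beta (S_1 + S_2)}\, p(S_1)\, p(S_2)\, dS_1\, dS_2 > 0.$$
Hence,
$$\int_{K_i}^{K_{i+1}}\!\!\int_{K_i}^{K_{i+1}} (S_1^2 + S_2^2)\, e^{\beta (S_1 + S_2)}\, p(S_1)\, p(S_2)\, dS_1\, dS_2 > 2 \int_{K_i}^{K_{i+1}}\!\!\int_{K_i}^{K_{i+1}} S_1 S_2\, e^{\beta (S_1 + S_2)}\, p(S_1)\, p(S_2)\, dS_1\, dS_2.$$
The left-hand side of this last inequality can be rewritten as
$$2 \int_{K_i}^{K_{i+1}} S^2\, e^{\beta S}\, p(S)\, dS \int_{K_i}^{K_{i+1}} e^{\beta S}\, p(S)\, dS,$$
whereas its right-hand side is equal to
$$2 \left(\int_{K_i}^{K_{i+1}} S\, e^{\beta S}\, p(S)\, dS\right)^2.$$
Therefore, the inequality is equivalent to $c_i'' > 0$.
Defining
$$p_i := \tilde D_i - \tilde D_{i+1}, \qquad \bar K_i := \frac{\bigl(\tilde C_i + K_i \tilde D_i\bigr) - \bigl(\tilde C_{i+1} + K_{i+1} \tilde D_{i+1}\bigr)}{p_i}, \tag{11}$$
and using (8) we can rewrite (7) and (6) in the simpler forms
$$c_i'(\beta_i) = \bar K_i, \tag{12}$$
$$\alpha_i = p_i\, e^{-c_i(\beta_i)}. \tag{13}$$
Equation (12) is easily solved for $\beta_i$ with the Newton-Raphson method using (9) and (10). Once the density q has been obtained in this manner, i.e., $\alpha_i$, $\beta_i$ have been calculated for i = 0, ..., n, we can calculate European option prices using numerical integration.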
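A minimal sketch of this bucket-wise solve is given below. It assumes a hypothetical log-normal prior, evaluates the integrals in (8), (9) and (10) with a plain midpoint rule, and runs an undamped Newton-Raphson iteration on (12). The function names, the quadrature and the sample inputs are illustrative choices and not the authors' implementation.

```python
import math

def lognormal_pdf(s, forward=100.0, vol=0.2, t=1.0):
    """Hypothetical prior: Black-Scholes (log-normal) density of S(T)."""
    sd = vol * math.sqrt(t)
    mu = math.log(forward) - 0.5 * sd * sd
    z = (math.log(s) - mu) / sd
    return math.exp(-0.5 * z * z) / (s * sd * math.sqrt(2.0 * math.pi))

def bucket_moments(beta, k_lo, k_hi, prior, n=2000):
    """Midpoint-rule values of c_i(beta), c_i'(beta), c_i''(beta) on [k_lo, k_hi]."""
    h = (k_hi - k_lo) / n
    s0 = 0.5 * (k_lo + k_hi)  # shift the exponent to keep exp() well scaled
    m0 = m1 = m2 = 0.0
    for j in range(n):
        s = k_lo + (j + 0.5) * h
        w = math.exp(beta * (s - s0)) * prior(s) * h
        m0 += w
        m1 += s * w
        m2 += s * s * w
    mean = m1 / m0
    return math.log(m0) + beta * s0, mean, m2 / m0 - mean * mean

def solve_bucket(p_i, k_bar, k_lo, k_hi, prior, tol=1e-12, max_iter=50):
    """Solve c_i'(beta) = k_bar by Newton-Raphson, then alpha = p_i * exp(-c_i(beta))."""
    beta = 0.0  # beta = 0 corresponds to leaving the prior's shape unchanged
    for _ in range(max_iter):
        _, c1, c2 = bucket_moments(beta, k_lo, k_hi, prior)
        step = (c1 - k_bar) / c2
        beta -= step
        if abs(step) < tol:
            break
    c, _, _ = bucket_moments(beta, k_lo, k_hi, prior)
    return p_i * math.exp(-c), beta

# Illustrative inputs: bucket [80, 100[ with mass 0.35 and target mean 88.
alpha, beta = solve_bucket(0.35, 88.0, 80.0, 100.0, lognormal_pdf)
```

Because c_i' is strictly increasing, the iteration has a single root to find. A production implementation would likely add a safeguard (for example step halving) for targets close to the bucket boundaries, and a separate treatment of the unbounded last bucket.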
The next results give the existence and uniqueness of such β i and, consequently, α i .
Proof We shall consider only the limit when β → ∞ since the other is treated analogously. Choose L and M such that K < L < M < K i+1 . Then, we have where C > 0 does not depend on β. Since S 1 − S 2 ≤ K − L < 0, the result follows.
Proof Here again, we only consider the first limit since the other is treated analogously. For all Using the previous Lemma, we obtain that the term inside square brackets goes to 1 as β → ∞. Now we shall consider the last term above and show that, by choosing a suitable K, it is as close to K i+1 as we want.
Firstly, we assume i < n. Then K i+1 < ∞ and, given a small ε > 0, we Hence, the term above goes to K i+1 = ∞ as K goes to ∞. (11), there exists a unique solution β i ∈ R of (12).
Proof Proposition 3.1 gives that c ′ i is continuous and the last Proposition states that Hence the existence follows from the Intermediate Value Theorem. Additionally, Proposition 3.1 also gives that c ′ i is strictly increasing and the uniqueness follows.
Proof This is shown just as Theorem 2.6 in [28] by using Theorem 2.5 by Csiszár also stated there.

The Special Case of Non-Relative Entropy
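The non-relative entropy can be seen as a special case of the relative entropy for which no prior p is given or, roughly speaking, the prior is given by the Lebesgue measure, p ≡ 1. Then equation (8) reduces to the following analytic expression:
$$c_i(\beta) = \begin{cases} \ln\dfrac{e^{\beta K_{i+1}} - e^{\beta K_i}}{\beta} & \text{for } i < n \text{ and } \beta \ne 0, \\[2mm] \ln (K_{i+1} - K_i) & \text{for } i < n \text{ and } \beta = 0, \\[2mm] \ln\left(-\dfrac{e^{\beta K_i}}{\beta}\right) & \text{for } i = n \text{ and } \beta < 0, \end{cases}$$
and the first and second derivatives (9) and (10) reduce to
$$c_i'(\beta) = \begin{cases} \dfrac{K_{i+1}\, e^{\beta K_{i+1}} - K_i\, e^{\beta K_i}}{e^{\beta K_{i+1}} - e^{\beta K_i}} - \dfrac{1}{\beta} & \text{for } i < n \text{ and } \beta \ne 0, \\[2mm] \dfrac{K_i + K_{i+1}}{2} & \text{for } i < n \text{ and } \beta = 0, \\[2mm] K_i - \dfrac{1}{\beta} & \text{for } i = n \text{ and } \beta < 0, \end{cases}$$
$$c_i''(\beta) = \begin{cases} \dfrac{1}{\beta^2} - \dfrac{(K_{i+1} - K_i)^2\, e^{\beta (K_i + K_{i+1})}}{\bigl(e^{\beta K_{i+1}} - e^{\beta K_i}\bigr)^2} & \text{for } i < n \text{ and } \beta \ne 0, \\[2mm] \dfrac{(K_{i+1} - K_i)^2}{12} & \text{for } i < n \text{ and } \beta = 0, \\[2mm] \dfrac{1}{\beta^2} & \text{for } i = n \text{ and } \beta < 0. \end{cases}$$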
Using these expressions instead of (8), (9), (10) allows for numerical integration to be avoided in an implementation in this case.
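The closed forms for the Lebesgue prior p ≡ 1 can be coded directly. The following sketch, with hypothetical function names, evaluates c_i and its first two derivatives for a bounded bucket and for the unbounded last bucket. It is an illustration of the formulas rather than a tuned implementation: for β extremely close to (but not equal to) zero the expressions suffer from cancellation, and a series expansion would be preferable there.

```python
import math

def _check_tail(beta):
    """The unbounded bucket [K_n, infinity[ only admits beta < 0."""
    if beta >= 0.0:
        raise ValueError("the last bucket requires beta < 0")

def c_closed(beta, k_lo, k_hi=math.inf):
    """c_i(beta) = ln of the integral of exp(beta*S) over [k_lo, k_hi[."""
    if math.isinf(k_hi):
        _check_tail(beta)
        return beta * k_lo - math.log(-beta)
    if beta == 0.0:
        return math.log(k_hi - k_lo)
    return math.log((math.exp(beta * k_hi) - math.exp(beta * k_lo)) / beta)

def c1_closed(beta, k_lo, k_hi=math.inf):
    """First derivative: the mean of the exponential density on the bucket."""
    if math.isinf(k_hi):
        _check_tail(beta)
        return k_lo - 1.0 / beta
    if beta == 0.0:
        return 0.5 * (k_lo + k_hi)
    e_lo, e_hi = math.exp(beta * k_lo), math.exp(beta * k_hi)
    return (k_hi * e_hi - k_lo * e_lo) / (e_hi - e_lo) - 1.0 / beta

def c2_closed(beta, k_lo, k_hi=math.inf):
    """Second derivative: the variance of the exponential density on the bucket."""
    if math.isinf(k_hi):
        _check_tail(beta)
        return 1.0 / beta ** 2
    width = k_hi - k_lo
    if beta == 0.0:
        return width * width / 12.0
    e_lo, e_hi = math.exp(beta * k_lo), math.exp(beta * k_hi)
    return 1.0 / beta ** 2 - width * width * e_lo * e_hi / (e_hi - e_lo) ** 2
```

For example, `c1_closed(0.0, 80.0, 100.0)` returns the midpoint 90 of the bucket, and `c1_closed(-0.05, 120.0)` returns 140, the mean of an exponential tail starting at 120.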

Minimizer Matching Call Option Prices
Buchen and Kelly describe in [9] a multi-dimensional Newton-Raphson algorithm to find the maximum entropy distribution (MED) if only call prices are given as constraints. The entropy H of a probability density q over [0, ∞[ is given by
$$H(q) := -\int_0^\infty q(S)\, \ln q(S)\, dS.$$
The minus sign in the definition ensures that H is always nonnegative for discrete densities (where the integral sign is replaced by a sum in the definition). For continuous densities, H is usually, but not always, positive. For example, the uniform density q(S) ≡ u over the interval [0, 1/u] has entropy H(q) = −ln u, which is negative for u > 1.
In [29], we show how the results of [28] together with the Legendre transform can be applied to obtain a fast and more robust Newton-Raphson algorithm to calculate the Buchen-Kelly MED. The main reason the algorithm is more stable is that the Hessian matrix has a very simple tridiagonal form. In section II.A of their paper, Buchen and Kelly also consider the case of "minimum cross entropy" (which we call relative entropy here) for a given prior density.
We now show how essentially the same algorithm as that in [29] can be applied to the relative entropy case. The next proposition consolidates and generalises the results of section 4 of [29], describing the entropy H, the gradient vector and the Hessian matrix to the case in which a prior density p is given.
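Arbitrage-free digital prices must lie between left and right call-spread prices, i.e.,
$$\frac{\tilde C_{i-1} - \tilde C_i}{K_i - K_{i-1}} > \tilde D_i > \frac{\tilde C_i - \tilde C_{i+1}}{K_{i+1} - K_i} \qquad \forall i = 1, \dots, n, \tag{14}$$
where the rightmost quantity for i = n must be read as zero.
We introduce the set $\Omega \subset R^n$ of all $\tilde D = (\tilde D_1, \dots, \tilde D_n) \in R^n$ verifying (14). Note that Ω is an open n-dimensional rectangle. Define $q_{\tilde D}$ as the density obtained as in Theorem 3.5 for given (undiscounted) digital prices $\tilde D$.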
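The rectangle Ω is determined by the call prices alone. The sketch below computes its lower and upper edges from the left and right call spreads, using the conventions K_0 = 0 and C_0 equal to the forward, and reading the right call spread at the last strike as zero. The function name and the sample prices are hypothetical.

```python
def digital_bounds(forward, strikes, calls):
    """Return [(lower_i, upper_i)] for i = 1..n, the open intervals for D_i."""
    k = [0.0] + list(strikes)    # K_0 = 0
    c = [forward] + list(calls)  # C_0 = forward
    n = len(strikes)
    bounds = []
    for i in range(1, n + 1):
        upper = (c[i - 1] - c[i]) / (k[i] - k[i - 1])   # left call spread
        if i == n:
            lower = 0.0                                  # K_{n+1} = infinity
        else:
            lower = (c[i] - c[i + 1]) / (k[i + 1] - k[i])  # right call spread
        bounds.append((lower, upper))
    return bounds

# Illustrative (made-up) undiscounted call prices at strikes 80, 100, 120.
bounds = digital_bounds(100.0, [80.0, 100.0, 120.0], [22.0, 10.0, 4.0])
```

Any starting point for an iteration over digital prices, for instance the midpoint of each interval, must lie strictly inside these bounds.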
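Proposition 4.1 For all $\tilde D \in \Omega$ the relative entropy of $q_{\tilde D}$ with respect to p can be expressed as
$$H(q_{\tilde D}\|p) = \sum_{i=0}^{n} p_i \bigl(\ln p_i + c_i^*(\bar K_i)\bigr),$$
where $c_i^*$ is the Legendre transform of $c_i$, $p_i := \tilde D_i - \tilde D_{i+1}$ and $\bar K_i$ is given by (11). As a function of digital prices, H : Ω → R is strictly convex, twice differentiable and, for all $\tilde D \in \Omega$,
$$\frac{\partial H}{\partial \tilde D_i} = \ln\frac{\alpha_i}{\alpha_{i-1}} + (\beta_i - \beta_{i-1})\, K_i, \qquad i = 1, \dots, n,$$
with $\alpha_i$ and $\beta_i$ given by (12) and (13), for all i = 0, ..., n.
In addition, the Hessian matrix at any $\tilde D \in \Omega$ is symmetric and tridiagonal with entries given by
$$\frac{\partial^2 H}{\partial \tilde D_i^2} = \sum_{j=i-1}^{i} \frac{1}{p_j}\left(1 + \frac{(K_i - \bar K_j)^2}{c_j''(\beta_j)}\right), \qquad i = 1, \dots, n,$$
$$\frac{\partial^2 H}{\partial \tilde D_i\, \partial \tilde D_{i+1}} = -\frac{1}{p_i}\left(1 - \frac{(\bar K_i - K_i)(K_{i+1} - \bar K_i)}{c_i''(\beta_i)}\right), \qquad i = 1, \dots, n-1.$$
Proof Let $g_{\tilde D} := q_{\tilde D}/p$ be the piecewise-exponential Radon-Nikodým derivative given in (5). Then
$$H(q_{\tilde D}\|p) = \int_0^\infty q_{\tilde D}(S)\, \ln g_{\tilde D}(S)\, dS = \sum_{i=0}^{n} p_i \bigl(\ln \alpha_i + \beta_i \bar K_i\bigr),$$
and substituting $\ln \alpha_i = \ln p_i - c_i(\beta_i)$ together with $c_i^*(\bar K_i) = \beta_i \bar K_i - c_i(\beta_i)$ yields the stated expression for H.
Notice that $p_i$ and $\bar K_i$ are given purely in terms of option prices, for all i = 0, ..., n, and so is $H(q_{\tilde D}\|p)$. Notice also that if the prior density p already matches the call prices, then $\alpha_i = 1$ and $\beta_i = 0$ for all i = 0, ..., n. From the relationship $c_i^*(\bar K_i) = \beta_i \bar K_i - c_i(\beta_i)$, it follows that, in this case, $c_i^*(\bar K_i) = -c_i(\beta_i)$. Since $\ln p_i = \ln \alpha_i + c_i(\beta_i) = -c_i^*(\bar K_i)$, the proposition above gives that $H(q_{\tilde D}\|p) = 0$, as expected. The expression for the derivative of H gives that if $\tilde D$ minimises H (i.e., $\tilde D$ is a root of the gradient of H), then $g_{\tilde D}$ is continuous. Furthermore, the MRED $q_{\tilde D} = g_{\tilde D}\, p$ has the same points of discontinuity as p.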
Using these last results, essentially the same Newton-Raphson algorithm as in [29] can be applied to find the relative entropy minimiser q. The only differences are that the functions c ′′ 0 , ..., c ′′ n must be replaced by their relative entropy versions (10) in the Hessian matrix of Proposition 4.1, and that in each iteration step, for a given set of digital prices, the algorithm of Section 3 must be used to calculate the MRED, instead of its non-relative version.
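Because the Hessian is tridiagonal, the linear system in each Newton-Raphson step can be solved in a number of operations proportional to n. The sketch below uses the standard Thomas algorithm for this purpose. This is an illustrative choice made here, not something prescribed by the text, and the function names and the undamped update are assumptions; the gradient and the Hessian bands are taken as given inputs.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve T x = rhs for a tridiagonal matrix T (Thomas algorithm).

    lower[i-1] is the entry (i, i-1), diag[i] the entry (i, i) and
    upper[i] the entry (i, i+1); lower and upper have length n - 1.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def newton_step(digitals, gradient, lower, diag, upper):
    """One undamped Newton update: D <- D - Hessian^{-1} * gradient."""
    delta = solve_tridiagonal(lower, diag, upper, gradient)
    return [d_i - s_i for d_i, s_i in zip(digitals, delta)]
```

In practice the updated digital prices must also remain inside Ω, so a step that leaves the rectangle would have to be shortened before it is accepted.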

Probability Densities for Characteristic Function Models
In this section, we look at four models that are popular in equity derivatives pricing. Our aim is to use the densities they give for the stock price at a fixed maturity as prior densities. In two of the models we have chosen, the Black-Scholes model and the Variance Gamma model, the density is analytically available. In the other two, the Heston and the Schöbel-Zhu stochastic volatility models, it is not. We therefore give a brief overview of these models and show how to calculate their densities in each case.
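Let $\tilde p$ be the density of x(T) := ln S(T). To simplify notation, we will usually write x and S, respectively, when it is clear from the context that we have fixed the maturity T. Then the density p of S itself is given by
$$p(S) = \frac{\tilde p(\ln S)}{S}.$$
If the characteristic function of $\tilde p$,
$$\varphi(u) := E\bigl[e^{iux}\bigr] = \int_{-\infty}^{\infty} e^{iux}\, \tilde p(x)\, dx, \tag{15}$$
is known, as in the Heston [22] or Schöbel-Zhu [32] stochastic volatility models, then $\tilde p$ can be obtained via Fourier inversion:
$$\tilde p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-iux}\, \varphi(u)\, du.$$
Since $\tilde p$ is a real-valued function, it follows from (15) that $\varphi(-u) = \overline{\varphi(u)}$, and we have
$$\tilde p(x) = \frac{1}{\pi} \int_0^\infty \Re\bigl[e^{-iux}\, \varphi(u)\bigr]\, du, \tag{16}$$
where $\Re[z] = (z + \bar z)/2$ denotes the real part of a complex number z. It can immediately be seen that an anti-derivative of $\tilde p$ is given by
$$\tilde P_0(x) = \frac{1}{\pi} \int_0^\infty \Re\left[\frac{e^{-iux}\, \varphi(u)}{-iu}\right] du.$$
Furthermore, it can be shown in a similar way as the Fourier Inversion Theorem itself, that $\lim_{x\to-\infty} \tilde P_0(x) = -\tfrac12$ and $\lim_{x\to\infty} \tilde P_0(x) = \tfrac12$, and therefore the function
$$\tilde P(x) := \frac12 + \tilde P_0(x)$$
gives an expression for the distribution function.
For pricing, we use the general formulation of Bakshi and Madan [4]. This can be used for a large class of characteristic function models that contains the Heston, Schöbel-Zhu and Variance Gamma models (see section 2 in [4], in particular Case 2 on p.218). We have S > K if, and only if, x > ln K. Let
$$\Pi_1 := \frac12 + \frac{1}{\pi} \int_0^\infty \Re\left[\frac{e^{-iu \ln K}\, \varphi(u - i)}{iu\, \varphi(-i)}\right] du, \tag{17}$$
$$\Pi_2 := \frac12 + \frac{1}{\pi} \int_0^\infty \Re\left[\frac{e^{-iu \ln K}\, \varphi(u)}{iu}\right] du \tag{18}$$
represent the probabilities of S finishing in-the-money at time T in case the stock S itself or a risk-free bond is used as numéraire, respectively. From (15) we can see that $\varphi(u - i)/\varphi(-i)$ contains the appropriate change of measure.
The price C of a European call option on a stock paying a dividend yield d is then obtained through the formula
$$C = S(0)\, e^{-dT}\, \Pi_1 - K\, e^{-rT}\, \Pi_2,$$
and the price D of a European digital call option through
$$D = e^{-rT}\, \Pi_2,$$
where r is the risk-free, continuously-compounded interest rate.
The integrals in (17), (18) must of course be truncated at some point a, which depends on the decay of the characteristic function of the model considered.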

The Black-Scholes Model
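Let the parameters r, d and T be given as above, and let σ > 0 be the volatility of the stock price. In the Black-Scholes model, the logarithm x(t) := ln S(t) follows the SDE
$$dx(t) = \bigl(r - d - \tfrac12 \sigma^2\bigr)\, dt + \sigma\, dW(t).$$
Define $\mu := \ln S(0) + \bigl(r - d - \tfrac12 \sigma^2\bigr) T$. The density of x(T) is normal and given by
$$\tilde p(x) = \frac{1}{\sigma \sqrt{2\pi T}} \exp\left(-\frac{(x - \mu)^2}{2 \sigma^2 T}\right). \tag{21}$$
The characteristic function of $\tilde p$ has a very simple form and is given by
$$\varphi(u) = \exp\bigl(iu\mu - \tfrac12 \sigma^2 T u^2\bigr). \tag{22}$$
Of course it is faster to use (21) directly instead of (22) and (16), but comparing these two methods lets one measure the additional computational burden.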
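The comparison of the two routes can be sketched as follows: the normal density of x(T) is evaluated once in closed form and once by inverting the Black-Scholes characteristic function with a truncated trapezoidal rule. The truncation point, the number of nodes and the function names are assumptions made for this illustration; the parameters are those of a market with S(0) = 100, r = d = 0, σ = 0.25 and T = 1.

```python
import cmath
import math

def bs_char_fn(u, mu, sigma, t):
    """Characteristic function of x(T) = ln S(T) in the Black-Scholes model."""
    return cmath.exp(1j * u * mu - 0.5 * sigma * sigma * t * u * u)

def bs_log_density(x, mu, sigma, t):
    """Closed-form normal density of x(T)."""
    var = sigma * sigma * t
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def density_by_inversion(x, char_fn, upper=60.0, n=6000):
    """(1/pi) * integral over [0, upper] of Re[exp(-iux) phi(u)] du (trapezoid)."""
    h = upper / n
    total = 0.0
    for j in range(n + 1):
        u = j * h
        value = (cmath.exp(-1j * u * x) * char_fn(u)).real
        total += value if 0 < j < n else 0.5 * value  # half weight at the ends
    return total * h / math.pi

def price_density_by_inversion(s, char_fn):
    """Density of S(T) itself, using p(S) = p~(ln S) / S."""
    return density_by_inversion(math.log(s), char_fn) / s

mu = math.log(100.0) - 0.5 * 0.25 ** 2  # S(0) = 100, r = d = 0, sigma = 0.25, T = 1
phi = lambda u: bs_char_fn(u, mu, 0.25, 1.0)
```

For models such as Heston or Schöbel-Zhu only the inversion route is available, and the truncation point has to be adapted to the slower decay of their characteristic functions.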

The Heston Model
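One of the most popular models for derivative pricing in equity and FX markets is the stochastic volatility model introduced by Heston [22]. Let x(t) := ln S(t). The model is given, in the risk-neutral measure, by the following two SDEs:
$$dx(t) = \bigl(r - d - \tfrac12 v(t)\bigr)\, dt + \sqrt{v(t)}\, dW_1(t),$$
$$dv(t) = \bigl[\kappa\bigl(\theta - v(t)\bigr) - \lambda\, v(t)\bigr]\, dt + \sigma \sqrt{v(t)}\, dW_2(t),$$
where $\langle dW_1(t), dW_2(t)\rangle = \rho\, dt$. The variance rate v follows a Cox-Ingersoll-Ross square-root process [12].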
The parameter λ represents the market price of volatility risk. Since we are only interested in pricing, we always set λ = 0 in what follows (see Gatheral [20], chapter 2).
Heston calculates the characteristic function solution, but as pointed out in [32], there is a (now well-known) issue when taking the complex logarithm. To be clear, we therefore give the formulation of the characteristic function that we use. Introducing the characteristic function of x(T ) is then given as Since implementations of the complex square-root usually return the root with non-negative real part (d 1 ), the key is simply to take the other root (d 2 ), as is done in equation (25). As shown in [1] and [25], this takes care of the whole issue.
The characteristic function for the Schöbel-Zhu model is given by Schöbel and Zhu in [32]. As Lord and Kahl point out ([25], section 4.2), similar attention has to be paid when taking the complex logarithm in this model's characteristic function as in Heston's. By directly relating the two characteristic functions (eq. 4.14), they show how the Schöbel-Zhu model can also be implemented safely. In the case study with SPX option data presented in subsection 6.2, with a maturity of less than half a year, however, we observed no problems with the characteristic function originally proposed by Schöbel and Zhu.

The Variance Gamma Model
The Variance Gamma (VG) process was introduced in [26], [27]. The density for x = ln S(T ) is given explicitly in Theorem 1 in [26]. Define Then the densityp is given bỹ where Γ is the Gamma-function and K is the modified Bessel function of the second kind.
If the parameter ν is set to zero, the characteristic function of p reduces to the Black-Scholes one given in (22). Otherwise, it is given by Lord and Kahl show ( [25], section 4.1) that this formulation of the characteristic function is safe.
Again, as with the Black-Scholes model, since both the density and the characteristic function are available, it is possible to compare the two different methods (29) vs. (30) and (16).

Two Numerical Examples

A Fictitious Market and Black-Scholes and Heston Prior Densities
In our first example, we take a hypothetical market with r = d = 0, T = 1, S = F = 100, in which call option prices are given by the Black-Scholes formula with volatility σ = 0.25. As prior densities, we take We calculate the Buchen-Kelly MED (Lebesgue prior), using the algorithm presented in [29], and the two Buchen-Kelly MREDs with priors p BS and p H , using the generalised algorithm presented in section 4, and compare the resulting call and digital option prices to see the influence of the priors.
The different call and digital option prices are reported in table 1. The densities were calculated using call prices at strikes • K 0 = 0, K 1 = 60, K 2 = 80, K 3 = 100, K 4 = 120, K 5 = 140 as respective constraints. These strikes are the ones in boldface in table 1.
We see that in the first case, where we had only the forward and an at-the-money call option as constraints, there are significant differences in both call and digital prices. The presence of the lognormal prior density makes MRED BS call prices cheaper compared to the original BS prices. Of course, under the prior density itself, call prices were cheaper because of the lower volatility, and this effect seems to persist. We also see that the fatter (exponential) tails of the MED translate into higher prices of deeply in-or out-of-the-money call options when compared to the other two densities.
However, as we add call prices at more strikes as constraints, the differences become smaller. In the third part of table 1, the prices of both call and digital options are clearly converging.

SPX Option Prices and Heston, Schöbel-Zhu and VG Prior Densities
In our second example, we look at call (ticker symbol SPX) and digital (BSZ) options on the Standard and Poor's 500 stock index [17]. The market data is from 18 July 2011. We consider those options which expire on 17 December 2011 and calibrate a Heston, Schöbel-Zhu and VG model, respectively, to call prices for 15 strikes 900, 950, ..., 1550, 1600 at constant intervals of 50 using a Levenberg-Marquardt least-squares method ([30], [11]). The model parameters we obtained are given in table 2.
[Table 2: calibrated model parameters. Only fragments survive: one row with the entries 0.0421, 0.1887, n/a, and the row ν with the entries n/a, n/a, 0.3638.]
Figure 1 shows the market implied volatility skew and the volatility skews generated by the three models. Apart from the last two strikes at 1550 and 1600, the fit looks quite good in all three cases.
Using formulas for the densities directly, if available, or otherwise numerical inversion (16), we plot the densities for S(T) given by these models in figure 2. The Heston and Schöbel-Zhu densities are almost indistinguishable from one another, whereas the VG density has a somewhat different shape with a slightly thinner right tail.
Table 3 shows the sum of squared errors between the market (SPX) implied volatilities and the model implied volatilities. The relative entropy H(q‖p) can be seen as an alternative measure of fit, since by (2) it measures how much the prior density p needs to be deformed to obtain a density q that perfectly matches the given market data. Interestingly, the Heston model fits best under either criterion, but the order of the Schöbel-Zhu and VG models is reversed in the two cases.
Finally, digital prices are reported in table 4. There are noticeable differences between market prices (although these must be taken with a pretty big pinch of salt due to the poor liquidity and large bid-ask spreads), the Buchen-Kelly prices and the prices given by the three models. However, regarding the three relative entropy distributions obtained using the different model priors, it seems that the effect of

Variance Swaps
In this section we recall the definition of a variance swap and the pricing formula based on replication through a log contract. (For more details see [10,16,23] and the references therein.) A variance swap is a forward contract on the annualized realized variance of the underlying asset over a period of time. More precisely, given observation dates $t_0 < \dots < t_m$, the realized variance is defined by
$$\sigma^2_{real} := \frac{252}{m} \sum_{i=1}^{m} \left(\ln\frac{S(t_i)}{S(t_{i-1})}\right)^2,$$
where S(t) denotes the spot price of the underlying asset at time t. The number 252 above is the annualization factor and reflects the typical number of business days in a year. The payoff of a variance swap is given by $N \cdot (\sigma^2_{real} - K_{var})$, where $K_{var}$ is the strike price for variance and N is the notional amount of the swap.
Assume that $\{S(t)\}_{t \ge 0}$ follows a stochastic differential equation
$$\frac{dS(t)}{S(t)} = \mu(t)\, dt + \sigma(t)\, dW(t). \tag{31}$$
Typically, $K_{var}$ is such that the theoretical price of the variance swap is null at inception and, in this case, it is said to be the fair variance swap rate and denoted by $\sigma^2_{fair}$. The theoretical realized variance over the period [0, T], and so $\sigma^2_{fair}$, is given by
$$\sigma^2_{fair} = \frac{1}{T}\, E\left[\int_0^T \sigma^2(t)\, dt\right].$$
We shall now derive a formula for $\sigma^2_{fair}$ based on the price of a log contract, that is, a derivative whose payoff at maturity T is ln S(T). Let x(t) := ln S(t) and apply Itô's formula to obtain
$$dx(t) = \bigl(\mu(t) - \tfrac12 \sigma^2(t)\bigr)\, dt + \sigma(t)\, dW(t). \tag{32}$$
Subtracting (32) from (31) gives
$$\frac{dS(t)}{S(t)} - dx(t) = \tfrac12 \sigma^2(t)\, dt.$$
Integrating from 0 to T and multiplying by 2/T gives
$$\frac{1}{T} \int_0^T \sigma^2(t)\, dt = \frac{2}{T} \left(\int_0^T \frac{dS(t)}{S(t)} - \ln\frac{S(T)}{S(0)}\right).$$
Finally, taking expectations yields
$$\sigma^2_{fair} = \frac{2}{T} \left(E\left[\int_0^T \mu(t)\, dt\right] + \ln S(0) - E[\ln S(T)]\right), \tag{34}$$
where E[ln S(T)] is the price of a log contract.
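The two quantities of this section can be illustrated with a short sketch. The first function computes an annualized realized variance from a list of observed prices, with the normalisation 252/m taken as an assumption. The remaining functions evaluate the log-contract expression for a constant drift rate and check it in the Black-Scholes case, where the fair rate should equal σ². All names and parameter values are hypothetical.

```python
import math

def realized_variance(prices, annualization=252.0):
    """Annualized realized variance from the m log-returns of the observed prices."""
    m = len(prices) - 1
    returns = [math.log(prices[i] / prices[i - 1]) for i in range(1, m + 1)]
    return annualization / m * sum(r * r for r in returns)

def fair_variance_from_log_contract(spot, drift, maturity, expected_log):
    """(2/T) * (drift*T + ln S(0) - E[ln S(T)]), assuming a constant drift rate."""
    return 2.0 / maturity * (drift * maturity + math.log(spot) - expected_log)

def expected_log_lognormal(spot, drift, vol, maturity, n=20000, width=10.0):
    """E[ln S(T)] under Black-Scholes by midpoint integration over x = ln S."""
    sd = vol * math.sqrt(maturity)
    mean = math.log(spot) + (drift - 0.5 * vol * vol) * maturity
    lo = mean - width * sd
    h = 2.0 * width * sd / n
    total = 0.0
    for j in range(n):
        x = lo + (j + 0.5) * h
        total += x * math.exp(-0.5 * ((x - mean) / sd) ** 2) * h
    return total / (sd * math.sqrt(2.0 * math.pi))

# Black-Scholes check with S(0) = 100, drift 2%, volatility 25%, T = 1.
log_price = expected_log_lognormal(100.0, 0.02, 0.25, 1.0)
fair = fair_variance_from_log_contract(100.0, 0.02, 1.0, log_price)
```

Here `fair` reproduces 0.25² = 0.0625 up to quadrature error, independently of the drift rate chosen.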

Maximum Entropy and Variance Swaps
In this section we shall derive a relationship between the fair swap rate of a variance swap and the entropy of the underlying asset density. This relationship follows from another one relating the entropies of the density q of a random variable S and the density of x := ln S, which is the subject of the next proposition.
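Proposition 7.1 Let q be the density of a positive random variable S and let $\tilde q$ be the density of x := ln S. Then
$$H(q) = H(\tilde q) + E[\ln S].$$
Proof Recall that the densities q and $\tilde q$ are related by
$$\tilde q(x) = q(e^x)\, e^x.$$
Hence, the change of variables $S = e^x$, $dS = e^x dx$, gives
$$H(\tilde q) = -\int_{-\infty}^{\infty} \tilde q(x)\, \ln \tilde q(x)\, dx = -\int_0^\infty q(S)\, \bigl(\ln q(S) + \ln S\bigr)\, dS = H(q) - E[\ln S].$$
Corollary 7.2 Consider an asset whose price S(t) at time t follows (31). Let q be the density of S(T), for some T > 0, and $\tilde q$ be the density of x(T) := ln S(T). Then the fair variance swap rate of a variance swap maturing at time T is given by
$$\sigma^2_{fair} = \frac{2}{T} \left(E\left[\int_0^T \mu(t)\, dt\right] + \ln S(0) + H(\tilde q) - H(q)\right).$$
Proof This follows immediately from the last proposition and (34).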
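As a worked check of the relation between the fair rate and entropy, consider the Black-Scholes case. The sketch below assumes a constant drift rate μ and a constant volatility σ, and writes m for the mean of ln S(T); these symbols are chosen for this illustration.

```latex
% x(T) = ln S(T) is normal with mean m and variance sigma^2 T
x(T) \sim N(m, \sigma^2 T), \qquad m := \ln S(0) + \bigl(\mu - \tfrac12 \sigma^2\bigr) T .

% Entropy of a normal density, and entropy of the log-normal density of S(T)
H(\tilde q) = \tfrac12 \ln\bigl(2 \pi e\, \sigma^2 T\bigr), \qquad
H(q) = H(\tilde q) + E[\ln S(T)] = \tfrac12 \ln\bigl(2 \pi e\, \sigma^2 T\bigr) + m .

% Fair variance swap rate expressed through the two entropies
\sigma^2_{fair} = \tfrac{2}{T} \bigl(\mu T + \ln S(0) + H(\tilde q) - H(q)\bigr)
               = \tfrac{2}{T} \bigl(\mu T + \ln S(0) - m\bigr) = \sigma^2 .
```

The entropy difference therefore recovers the Black-Scholes variance exactly, whatever the drift.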
When the density q of S(T ) is known, the price of a log contract E [ln S(T )] can be computed through numerical integration. Moreover, when q is the MED, that is, in the non-relative entropy case, we show how this price can be computed analytically. By definition, the expectation is given by

Numerical Examples
In the first example, the market is given as in subsection 6.1 by a Black-Scholes model with volatility σ = 0.25, with the same sets of 1, 3 and 5 strikes. Table 5 shows three quantities obtained from (non-relative) Buchen-Kelly MEDs fitted to the forward and call prices at these strikes: the fair variance swap rate $\sigma^2_{fair}$, its square-root for comparison with implied volatilities, and the entropy. The average volatility $\sigma_{fair}$ and the entropy can be seen as two different measures of the dispersion of S(T). As the number
In the second example, we use the same reference market as above, but now we include a prior Black-Scholes density and calculate the MREDs matching the forward and call prices at 1, 3 and 5 strikes. The prior density is characterized by its volatility $\sigma_p$. Table 6 shows that increasing $\sigma_p$ has the effect of increasing the fair variance swap rate of the MRED. However, we see that as we add more constraints, this effect is diminished. In the case of 5 strikes, it is barely noticeable. Note that in the case $\sigma_p = 0.25$ where the prior density already matches the given constraints, the MRED is equal to the prior, and we recover the volatility of the Black-Scholes process as the square-root of the fair variance swap rate.
In the third example, summarized in Table 7, we proceed as in the second one, but now with a Heston density as the prior. The Heston parameters are the same as in subsection 6.1, i.e. κ = 1, θ = 0.04, ρ = −0.3, $v_0$ = 0.04, but now we vary the volatility σ of the variance and measure its impact on the fair variance swap rate of the MRED. In the case of 1 strike, we see clearly that this impact is very strong. However, we notice again that increasing the number of strikes quickly diminishes the strength of the impact.

Conclusion
In this article we generalise the algorithm presented in [29] to the relative entropy case. The algorithm allows for efficient computation of a risk-neutral probability density that exactly gives European call option prices quoted in the market, while staying as close as possible to a given prior density under the criterion of relative entropy.
It is not necessary to have an analytic expression for the prior density in question. In practice, several popular equity and FX models work through their characteristic functions and numerical Fourier inversion techniques. We pick two of these as examples, namely the Heston and the Schöbel-Zhu stochastic volatility models, and show how they nevertheless can be used to provide the prior density and be incorporated into our algorithm. In other cases, analytic expressions for the density are available, such as for the Black-Scholes model and the Variance Gamma model, and we also incorporate these into our analysis.
As an application, we study the impact the choice of prior density has. In a first, purely hypothetical scenario, we assume that only the prices of a few options are quoted. We observe that using a prior density does indeed lead to significantly different option prices when compared to pricing with a pure log-normal density or a piece-wise exponential Buchen-Kelly density.
In a second scenario we use option price data for S&P500 index options for a fixed maturity traded on the CBOE. We calibrate three different models to these data and observe that, although the models generate noticeably different digital option prices, the prices obtained when using minimum relative entropy densities, with these models for the prior densities, agree almost perfectly. Furthermore, these prices are essentially the same as those given by the (non-relative) Buchen-Kelly density itself. In other words, in a sufficiently liquid market the effect of the prior density seems to vanish almost completely.
We also study variance swaps and establish a formula that relates their fair swap rate to entropy. In the case of MEDs, we give an explicit formula for the fair swap rate. In the case of MREDs, we study the impact the prior density has on the fair swap rate and see that, again, while it has a substantial effect when constraints exist at only a very small number of strikes, this effect diminishes rapidly as more constraints are added.