Comparing Two Different Option Pricing Methods

: Motivated by new ﬁnancial markets where there is no canonical choice of a risk-neutral measure, we compared two different methods for pricing options: calibration with an entropic penalty term and valuation by the Esscher measure. The main aim of this paper is to contrast the outcomes of those two methods with real-traded call option prices in a liquid market like NASDAQ stock exchange, using data referring to the period 2019–2020. Although the Esscher measure method slightly underperforms the calibration method in terms of absolute values of the percentage difference between real and model prices, it could be the only feasible choice if there are not many liquidly traded derivatives in the market.


Introduction
Empirical studies show that real financial markets are incomplete, which means that not all derivative securities like options are replicable by means of some initial capital plus the value process of some self-finance trading strategy, and therefore hold unhedgeable and undiversifiable risks. The second fundamental theorem of asset pricing implies that in incomplete markets there typically exist several martingale measures consistent with the no-arbitrage principle that can be chosen as pricing measures (see, e.g., Jeanblanc et al. 2009;Shreve 2004). In fact, the range of option prices composes the total range between the inf and the sup of expected values with respect to martingale measures, and is therefore too large for practical purposes, see (Eberlein and Hammerstein 2004, Theorem 11.55).
Therefore, it makes sense to select a particular martingale measure for pricing purposes. In this study, we introduce, discuss, and compare two methods for the simulation of European call option prices: one based on the Esscher martingale measure and the other on the calibration to real-traded securities with an entropic penalty term. We implemented them in the programming language R and contrast the simulated derivative prices with real data retrieved from NASDAQ option chains. The objective of the paper was to provide a good fit between real and simulated contingent claim prices, consequently suggesting a choice for risk-neutral pricing measures in incomplete markets. In order to fulfill this aim, we constructed a model rich enough to describe stock prices but feasible for practice and connected to option prices. In this context, it is natural to exogenously specify different, suitable underlying asset price dynamics on a case-by-case basis.
The Esscher transform, invented by the Swedish actuary F. Esscher and introduced in Esscher (1932), is a time-honored tool in actuarial science for approximating the distribution of the aggregate claims of a portfolio. It consists of an exponential tilting procedure. More recently, it has been used as a premium calculation principle, see Van Heerwaarden et al. (1989), in option pricing, 2 of 28 see Gerber and Shiu (1994), and also in pricing defaultable assets, see Lee and Rheinländer (2012). In our financial setting, the agent has the option to invest in a financial market via admissible strategies. Hereby we exclude strategies leading to arbitrage opportunities. The corresponding theory has been developed by Kallsen and Shiryaev, see Kallsen and Shiryaev (2002). In Benth and Schmeck (2014), the authors used an Esscher transform incorporated in a two-factor model to derive futures prices in electricity markets. Furthermore, pricing methodology for weather derivatives in a specific two-factor model is established in Hell et al. (2012).
The idea of calibration is to consider a set of liquid option prices, and to find a market martingale measure such that the expected values of the option payoffs are as close as possible to the observed prices. This approximation problem is typically ill-posed, and one has to include a regularization term, we opted for an entropic penalty term in this regard. A general exposition about the calibration method can be found in Cont and Tankov, see Cont and Tankov (2004a).
Sections 2 and 3 are divided into a theoretical and a numerical part. The second section is devoted to the geometric Esscher measure and Laplace cumulant processes. In this part, we modeled the log-prices of the stock with a COGARCH process (introduced in Klüppelberg et al. (2004)) and present a sufficient condition for the existence of such a measure. Assuming the driving Lévy process to have a N IG (normal inverse Gaussian) distribution, which is infinitely divisible and thus corresponds to a Lévy process, we explain the original procedure-relying on the canonical representation theorem for semimartingales-followed by simulating the paths of the stock pricesunder the martingale measure. We pinpoint that, as Lemma 1 shows, the choice of the N IG distribution in the COGARCH model is crucial, since it allows for generating the risk-neutral trajectories of the underlying asset price.
In the third section, the calibration method is explained; using the fundamental concepts of relative entropy of distributions and Lévy processes on the Skorohod space (presented in Appendix E), we chose a martingale measure under which the dynamics of the stock prices follow an exponential Lévy process. In this part we mainly follow Cont and Tankov (2004a), even if the mathematical structure was refined and generalized, and the numerical approximation machinery was tailored to our goals.
In order to make the readers familiar with the theory and the notation that stands behind the main concepts of these two methods, in the appendix we also present a construction of the N IG distribution and some essential results on Lévy processes, highlighting the connection with the characteristics of semimartingales.
The main point of the present paper is as follows: if there are many liquid options around, like plain vanilla calls and puts on liquid stocks, we would expect the calibration method to perform best, therefore, we chose it as the benchmark method. However, in new financial markets, like insurance derivatives, cryptocurrencies, energy, or electricity markets, there are often only a few derivatives or any at all available, or the underlying (like electricity) is difficult to trade in since it is not storable. In these cases the calibration method is not implementable, and there is not much choice other than to use the Esscher method which has been done, e.g., in the book by Benth, Benth, and Koekebakker about electricity and related markets, see Benth et al. (2008). Besides that, there are few, if at all, studies about practical implementation of hedging strategies in incomplete markets, therefore our study fills a gap in this direction. Our main result thus is that while the Esscher martingale measure based pricing method in a liquid market does underperform the calibration method, as is to be expected, it does so only by less than 5%. So it might be a feasible choice of method in new financial markets.

Esscher Measure Method
In this section, the dynamics of the stock prices are introduced and sufficient conditions for the existence of the Esscher measure, which was used to simulate European call options prices, are established. We refer to Appendix D for a brief summary of the theory, which is necessary to construct the geometric Esscher measure. In the following, we use notation that can be found, e.g., in the books (Rheinländer and Sexton (2011);Shiryaev and Jacod (2003)), as well as references therein.

The Model
On a probability space (Ω, F , P) we introduce a driving, R-valued Lévy process L = {L t } t with generating triplet σ 2 , ν, γ h with respect to a truncation function h (see Appendix B and Appendix C). We endow this space with F, the augmented filtration of L. Define the dynamics of the spot prices by the process S = {S t } t , where S := S 0 exp (G) , S 0 ∈ R + , with the log-prices G = {G t } t which follow a COGARCH process, introduced in Klüppelberg et al. (2004): The volatility process σ 2 = σ 2 t t is given by with σ 2 0 ∈ R + and an adapted, right-continuous with left limits (RCLL) process X = {X t } t defined by We recall that a process is said to be RCLL if it has, almost surely, right-continuous trajectories with left limits. Although σ 2 is a left-continuous process, as shown by (2), we use σ − with the aim to highlight that we are working with a process which is adapted and left-continuous, therefore predictable. For an adapted process r = {r t } t modeling the interest rate, the discounted spot price process is given by S = S t t , where S t := exp − t 0 r s ds S t , t ≥ 0. Thus, we consider the process G = {G t } t , where G t := − t 0 r s ds + G t for every t ≥ 0, whose dynamics follow dG t = −r t dt + dG t . At this point, we can express S = S 0 exp (G ) and we want to find a process θ ∈ L (G ) such that S is a P θ -local martingale: P θ denotes the geometric Esscher measure, or Esscher martingale transform, for the exponential process S (see Appendix D). (1), it is clear that G jumps at the same times as L does, and ∆G t = σ t− ∆L t for every t ≥ 0. For t ∈ R + and ω ∈ Ω, the measure associated with its jumps is

Remark 1. From
Theorem 1. Let θ ∈ L(G ) be such that θ · G is exponentially special and Z θ is a uniformly integrable martingale. If also (θ + 1) · G is exponentially special and θ satisfies for every t ≥ 0, then S is a P θ -local martingale.
Note that G is a 1-dimensional process, thus the geometric Esscher measure is unique.
Taking into account Lemma A1 in Appendix D, which provides and considering that (σ − · L) c = σ − · L c and σ − · L c , σ − · L c = σ 2 − σ 2 dt, we obtain the characteristics of G under P with respect to h: Thus, if θ satisfies κ (θ + 1) − κ (θ) = 0, thanks to Theorem A2 we can conclude that S is a P θ -local martingale. In fact, using (A11) we have where the last equality holds by (3).
The next lemma provides us with the candidate solutions to Equation (3) when L follows a N IG distribution (see Appendix A). Lemma 1. Let L = {L t } t be a N IG-distributed Lévy process with parameters (α, β, µ, δ) and define the truncation function h ( fulfills Equation (3), then for every t ≥ 0 we have with R t := −r t + µσ t− . Proof. Since L 1 ∼ N IG (α, β, µ, δ), we apply (A6) and (A9) (see Appendices A and B) to obtain where γ 1 := 2δα Proceeding as in the proof of Theorem 1, we have Since σ − γ 1 = σ − [(θ + 1) − θ] γ 1 , we expand on the chain of equalities in (8): where the last equality is obtained by combining (5) and (7). At this point it remains to find the possible solutions for every t ≥ 0 to the equation which can be written as and repeating once again the same operation of squaring we end up with an equation of second degree in θ: Computing algebraically the solutions of (10) we get the expressions in (6).
We have empirically tested both possible branches of solutions in (6), and we have chosen Risks 2020, 8, 108 6 of 28 to run the simulation because it prevents the risk-neutral dynamics from "exploding", i.e., from skyrocketing to unrealistically high prices in a short time.
Remark 2. Unfortunately, it does not seem possible for us to prove, from an analytical point of view, that either candidate solution in (6) solves (3), i.e., that it is a true solution. What the simulations suggest is that only the process in (11) is indeed a solution of (3). This insight is confirmed by the numerical experiments to a greater extent. In fact, it turns out that both processes θ 1,2 satisfy Condition (5). However, θ 1 makes the right hand side in (9) negative, and this implies that it does not solve (3). On the contrary, the process θ 2 -the one we pick in (11)-makes such term positive, hence it solves (3), indeed.
2.1.1. Simulation of the P θ -Dynamics The goal of this section is to carefully describe a procedure to generate the risk-neutral dynamics of the underlying asset price. This allows us to compute the simulated European call option prices with the Monte Carlo method.
Let r ∈ R + represent the constant annual interest rate. We assume that the driving Lévy process L is N IG-distributed and that the hypothesis of Theorem 1 hold (with respect to the truncation function h (x) = x1 D (x) , x ∈ R). Using the R-package yuima (see Iacus (2011)) we estimate the parameters of the model from the time series of the underlying asset log-prices. Next, we simulated a sufficiently large number of paths of the variance process σ 2 with the found parameters: by (11), from each of them we can get a trajectory of θ. The issue reduces to generating the paths of G, but note that, after the change of measure, L is not a Lévy process anymore because F θ t (dx) = e θ t x ν (dx) , t ≥ 0. Consequently, G is not a COGARCH process under the martingale measure. Then, in order to simulate its trajectories we use the canonical representation for semimartingales (Shiryaev and Jacod (2003), Chapter II, Theorem 2.34): for a d-dimensional semimartingale X = {X t } t with characteristics B, C, ν X relative to the truncation function h, it holds: The symbol denotes the integration with respect to a random measure: we refer to (Shiryaev and Jacod (2003), Chapter II, Section 1) for an extensive study of this topic. For the reader's convenience, we provide the basic definition. Definition 1. Let µ be a random measure on R + 0 × R and W : Ω × R + 0 × R → R be measurable function with respect to the product σ-algebra O ⊗ B (R), where O denotes the optional σ-algebra on Ω × R + 0 . The integral process W µ is defined by otherwise .
For an arbitrarily chosen ≤ 1, fix the truncation function h (x) := x 1 {|z|< } (x) for x ∈ R, and let (0, ν, γ h ) be the corresponding generating triplet of L. From the proof of Theorem 1, specifically from (4), we readily get the characteristics b t , 0, ν (σ t− ·) −1 (dx) of G under P, and invoking (A12) and Remark A2 in Appendix D we find them under P θ : We now represent G by (12), noting that G 0 = 0 and G c = 0 since c θ = 0. As regards the term B = {B t } t , we expand on the computations in (13), getting and obtain the variables of the process B by B t = (0,t) b θ s ds, t ≥ 0. At this point we need to approximate a trajectory of the process h (x) (µ G − ν G ). In order to do so, we neglect the contribution of the jumps of G with absolute value smaller than , focusing just on the term which in turn cancels out with one of the addends defining B. Heuristically speaking, in this approach we are assuming that the small jumps of the underlying asset price do not contribute to the determination of the option value in a significant way. However, we may refer to Asmussen and Rosiński (2001) for a more sophisticated procedure taking into account the variation of such small jumps in the case of Lévy processes. Finally we concentrate on the term (x − h (x)) µ G : we simulate its paths similarly to the Lévy-Itô decomposition theorem (see Sato 1999, Theorem 19.2). According to this result, if L is a Lévy process with generating triplet (A, ν, γ) and D a, is a compound Poisson process with constant ν (D 1,∞ ) and distribution Thus, we take a nonhomogeneous Poisson process with time-varying intensity and impose the time-varying jumps sizes to follow The jump times of this process have been simulated by the thinning algorithm described in Lewis and Shedler (1979).
Denote by N the number of iterations we run and by S i , for i = 1, . . . , N, the corresponding, simulated trajectories of the spot price under the pricing measure P θ . We obtain the value of an European call option with strike K and maturity T following the Monte Carlo method, i.e., computing the sample mean of the vector of components e −rT S i (T) − K + , i = 1, . . . , N.

. Empirical Results
We have empirically tested the Esscher method on the prices of call options on Apple Inc. stock (ticker symbol: AAPL) with fixed strike at $320. In Figure 1a, the simulated option prices are shown as function of their maturities. The average of the absolute values of percentage difference is 3.1646%.
Furthermore, we applied this method to the prices of call options on Microsoft Corporation stock (ticker symbol: MSFT) with strike $190: we plot the outcomes in Figure 1b. In this case, the mean of absolute values of percentage difference settles down at 5.0518%.
Finally we report the results of a test on call options with strike $910 with Tesla, Inc. (ticker symbol: TSLA) as underlying stock. In Figure 1c we can notice a big discrepancy between the real price and the predicted value for the shortest-term maturity taken into account, the latter being 37.2755% smaller than the former. However, if we neglect this first term we recover a result in line with the previous experiments, namely an average for the absolute values of percentage difference of 4.0555%.
In all the three cases, we set the number of Monte Carlo iterations N = 10 4 and we used the time series of the log −prices associated to the trading year before 19 February 2020, to estimate the parameters of the COGARCH model (see Section 2.1.1). Therefore, the simulations were run prior to Apple's stock split (on 28 August 2020) and Tesla's stock split (on 31 August 2020).
Remark 3. The real data was obtained from https://old.nasdaq.com, where American options are traded. Although our approximation generates the prices in an European model, it can be accepted as meaningful since: • the analyzed derivatives have short-term maturities, so we are allowed to ignore the dividend yield (however, TSLA does not pay dividends); • the time values of the options were always positive, implying that it is convenient to sell the option rather than exercising the call right.

Calibration with Entropic Penalty Term Method
In the Esscher method, the density process is completely defined by the spot prices: this is an intuitive drawback in liquid markets, because it does not use all the available information (e.g., the time series of derivative prices). This is a motivation to consider the Calibration Method , which consists in modeling the spot prices dynamics directly under a martingale measure "chosen" by the market.

The Model
We start the practical analysis by taking into account a driving, R-valued Lévy process L = {L t } t with generating triplet σ 2 , ν, γ on a probability space (Ω, F , Q) endowed with the augmented filtration of L, which satisfies the usual conditions and is denoted by G. We describe the stock prices S = {S t } t with an exponential Lévy model: S t := S 0 exp (rt + L t ) , t ≥ 0, with S 0 ∈ R + and r ∈ R + representing the constant annual interest rate. The discounting process R = {R t } t reduces to the deterministic function R t := exp (−rt) , t ≥ 0. Since we need Q to be a martingale measure for S we require: With these assumptions, Remark A1 in Appendix B proves that {exp (L t )} t is a martingale with E Q [exp (L t )] = 1 for every t > 0. Considering the discounted spot prices process S = S t t , defined by S t := R t S t = S 0 exp (L t ) , t ≥ 0, we can state that it is a Q-martingale with constant expectation equal to S 0 . In our discussion the driving Lévy process L will be a pure jump process, so its generating triplet simplifies as σ 2 = 0. In particular, due to assumption (ii) we get the next relation: Assume that there are N European call options with fixed maturity T available in the market, and for each j = 1, . . . , N let K j and C j be the strike price and the observed price of the j-th derivative, respectively. It is convenient to consider the quantities k j = log K j , j = 1, . . . , N. In this setting, S is the price process of the underlying asset. Since Q is a martingale measure for S, we can use it to price Thus, knowing the Lévy measure ν, the whole generating triplet could be recovered and therefore the option prices computed.
The reasoning which leads the calibration is choosing ν such that C j ν is "close" to C j for any j = 1, ..., N. Hence we pick the Lévy measure of the model by a least-squares procedure: Since the problem expressed in (15) is ill posed (see (Cont and Tankov 2004a, Chp. 13)), we introduce a regularization term. Let L 0 = L 0 t t be a, pure jump, driving Lévy process with generating triplet (0, ν 0 , γ 0 ) statistically estimated from the time series of the underlying asset price. Furthermore, let Q L 0 and Q L be the distributions on the Skorohod space (D, F D ; F) generated by L 0 and L, respectively, which are supposed to be equivalent on F t for every t > 0 (see Appendix E). We use the relative entropy as a measure of the distance from Q L 0 , so that by Remark A3 the issue is reduced to solving the optimization problem: The term α is a regularization parameter : the higher it is, the more we trust the initial distribution and the less importance we give to calibration. The existence of a solution to (16) has been studied, among others, in the paper Cont and Tankov (2004b). We are going to relax the assumptions in (16), considering the minimization in the set of the Lévy measures equivalent to ν 0 .

Numerical Approximation
The goal of this section is to describe in each and every detail a discretization procedure that allows us to tackle the optimization problem in (16). To this aim, it is necessary to express the objective functional in (16) as a function of the masses of the discretized Lévy measures. The steps of our argument are the following: estimate the parameters of the prior Lévy process L 0 from the time series of log-returns; II. introduce a discretization grid for the prior Lévy measure ν 0 and the driving Lévy measure ν (in what follows, the points of such a grid are denoted by y 1 < · · · < y N d ). Their discretized versions are denoted by ν 0 d and ν d , respectively; III. compute the discrete version of the entropy term in (16) as a function of the masses of ν d ; IV. use an approach based on the Fourier inversion theorem to get an approximation of the modified time values of the options at the log-strikes under scrutiny. This allows us to obtain (23), a discretized version of the objective functional in (16); V.
calculate explicitly the derivatives of the discretized objective functional to speed up the simulations; VI. choose the regularization parameter α.
Fix a maturity T and define the grid x j ∈ R + , j = 1, . . . , N, with x 1 < x 2 < . . . < x N : these points represent the log-strike prices of the options available on the market. Moreover, from the time series of the log-returns we obtain the parameters of the historical, N IG-distributed Lévy process L 0 = L 0 t t by a maximum likelihood procedure using the R-package ghyp. Note that, in this case, one can select any pure jump, infinitely divisible distribution: we opt for the NIG by analogy with the Esscher's method, and our choice is satisfactory, as the final goodness of fit between real data and model prices shows. First we introduce a discretization grid consisting in the points y h , h = 1, . . . , N d , with y 1 < y 2 < . . . < y N d , which constitutes a partition of the interval [−M, M] for some M > 0, and then we approximate the Lévy measure of L 0 with the discrete version ν 0 , where δ (a) denotes the Dirac measure at a point a and Now we take into account another measure which has the same mass points as the previous one: We note that the calibrated measures ν d and ν 0 d are equivalent, but ν d and the prior ν 0 are not. We understand the measure ν d as the discretized Lévy measure of the driving, pure jump Lévy processes L = {L t } t . We now want to compute the Radon-Nikodym derivative dν d dν 0 d with the aim to explicitly express the entropy term of (16) in the discrete version Set y 0 = −∞ and define the function Moving back to the integral (17), we get: Following the Carr and Madan approach (see Carr and Madan 1999, Section 3.2) for further details), we are able to to express the quantity ∑ N j=1 C j ν − C j 2 as a function of ν 1 , . . . , ν N d . The option price C ν (k) is not an integrable function, as a swift application of Lebesgue's convergence theorem shows that it converges to S 0 as k → −∞. Therefore, we focus on the modified time value z T , defined by z T (k) := C ν (k) − S 0 − e k−rT + , k ∈ R. We suppose that z T and its inverse Fourier transform ζ T are integrable, so by inversion (see, e.g., Rudin 1987, Theorem 9.11) we have Risks 2020, 8, 108 12 of 28 Fix k ∈ R; we first analyze the term In a similar way we get This provides the following expression for z T (k): The assumption (ii) ensures that R e y Q L T (dy) = 1, so we finally obtain The insight here is to use the Fourier transform and its inverse to estimate the values z T x j for every j = 1, . . . , N. Hence we fix a point u ∈ R and introduce the inverse Fourier transform Under suitable conditions which allow us to switch the order of integration we have ζ T (u) = − e −rT √ 2π (−∞,0] (y+log S 0 +rT,log S 0 +rT) e iuk S 0 e rT+y − e k dk Q L T (dy) + e −rT √ 2π (0,∞) (log S 0 +rT,y+log S 0 +rT) e iuk S 0 e rT+y − e k dk Q L T (dy) .
We turn our attention on the computation of the first term of the sum (19). An explicit calculation of the inner integral (respect to the Lebesgue measure) of such addend, indicated by I 1 , gives The same strategy allows us to compute the inner integral I 2 of the second addend in (19), as well. Precisely, for every y ∈ (0, ∞) we get Therefore we are able to represent the inverse Fourier transform as Both (A9) in Appendix B and assumption (i) make us conclude that By the definition of Ψ in (A8) (see Appendix B), substituting the discrete version ν d for the Lévy measure ν associated to L, as a consequence of (14) we get, for every u ∈ R, (1 − e y h ) ν h . (21) Plugging (21) into (20) we obtain, for u ∈ R, the final approximation With the aim to approximate z T at the points x j , for j = 1, . . . , N, we construct a new, uniform grid with mesh d containing the log-strike one. Specifically, fixed N ∈ N, we set , with A := 2π d , which is the size of the discretization interval. We carry out our construction so that for every j = 1, . . . , N and h = 1, . . . , N d there exists a n hj ∈ − N + 1, . . . , N − 1 such that x j − y h = x n hj . We eventually introduce the points of the discretization grid as u k := − A 2 + k∆, k = 0, . . . , N.
We then compute where w k are chosen by the trapezoidal rule, i.e., Therefore, knowing ζ T (u k ) for k = 0, . . . , N − 1, we can use a fast Fourier transform (FFT) to estimate the values z T ( x n ), for n = 0, . . . , N − 1, and, due to the symmetry of the grid, an inverse discrete Fourier transform to get z T ( x n ) for n = − N + 1, . . . , −1. Thus, we reduce the optimization problem (16) to the minimization in (R + ) d of the objective functional: The optimization is run under the L-BFGS-B algorithm, so we compute the derivatives of F to speed it up. We focus on deriving the first addend in (23), since calculating the entropic term is straightforward. Denote by g ν 1 , . . . , ν N d the argument of the exponential in (22) and define Under suitable integrability condition, for every u ∈ R and h = 1, . . . , N d we have Direct calculation from (18) leads, for every k ∈ R and h = 1, . . . , N d , to Taking k = x n , for n = 0, . . . , N − 1, we can approximate the first addend in (24) with an FFT: Summarizing, we can numerically calculate the derivatives of F : The regularization parameter α should be eventually determined; to this aim, we follow a procedure loosely inspired by the Morozov discrepancy principle (see the classical Morozov (1966)). We start off by minimizing the quadratic pricing error (15), ignoring the entropy term and using the discretized Lévy measure of L 0 as starting point. The value 0 of the functional at the found minimum ν is interpreted as the distance between the market and the selected model class. After that we consider the bid-ask spread of the options on the market and denote by its Euclidean norm. The absolute value of the difference between these two quantities is then allowed to be slightly greater, although it must maintain the same order of magnitude. This leads to the introduction of another term := c | 0 − |, where c is a positive constant to be picked. At this point the functional in (16) is strictly increasing in α, so we choose the regularization parameter as follows: For our applications, taking c ≈ 2 has proven to be a satisfactory choice in terms of goodness of fit.

Empirical Results
The final step is to empirically apply this method on prices of real-traded options. In particular, we have tested it with derivatives on the same stocks as those considered for the Esscher's method. In order to estimate the parameters of the historical Lévy process, we used the same time series of log-prices as those of the Esscher's method (see Section 2.1.2) All the call options under scrutiny expire in January 2021: 11 months from the time of simulation.
In the case of call options on Apple Inc. stock (AAPL) we obtained an average of the absolute values of percentage difference of 3.5794%: Figure 3a displays the results of such implementation. As regards call options on Microsoft Corporation stock (MSFT), the absolute values of percentage difference is 1.6425% and the results are shown in Figure 3b. Finally we calibrated our method to the prices of call options on Tesla, Inc. stock (TSLA): Figure 3c shows the outcomes. Here the mean of absolute values of percentage difference settles down at 0.9148%. This time we retrieved the real prices from https://finance.yahoo.com, but the same considerations as Remark 3 apply.

Conclusions
The main purpose of this research is to quantify the performances of two option pricing methods in a liquid market like NASDAQ. The first approach is based on the Esscher measure, the second one on the calibration to real-traded call option prices with an entropic penalty term.
Our experimental analysis shows that, from a computational point of view, the Esscher method with 10 4 Monte Carlo iterations is far faster than the calibration method (see Figure 2). The following table reports the exact execution times that we have obtained with an Asus ZenBook UX31A: This fact is especially noticeable when we need to simulate the prices of options with different maturities and same strike. On the other hand, the calibration method allows generating the entire option chain for a fixed maturity with great precision. Moreover, it is more stable than the previous one, as it provides results closer to real data for a larger range of stock prices.
Comparing averages of absolute values for percentage difference between real and simulated prices, the calibration procedure offers better outcomes than Esscher's, with the only exception of AAPL call options, where the latter outperforms the former by approximately 42 basis points. Nevertheless, the Esscher method only needs the historical time series of the underlying asset price to be applied, so it is feasible also in markets with few derivatives. This study shows that the Esscher measure is an appealing and efficient solution to price call options in illiquid-or low liquid-financial markets.

Future Research
Considering the future work on the subject, one of the directions worth investigating is the Linear Esscher measure method. The importance of focusing on this approach is given by its intrinsic link to the minimal entropy Hellinger martingale measure, as indicated by Rheinländer and Sexton in (Rheinländer and Sexton (2011), Chp. 9), and, in much more detail, by Choulli and Stricker in Choulli and Stricker (2004). Moreover, it would be interesting to extend our study to other asset classes like metal futures, and to compare it with the results obtained in Chen (2011).

Appendix A. Normal Inverse Gaussian Distribution
In our work, we extensively use the N IG distribution, because it provides a good fit for the analyzed data. A probability measure µ on R is said to be N IG with parameters (α, β, µ, δ), where 0 ≤ |β| < α, µ ∈ R and δ > 0, if it has the following density: with K 1 that denotes the modified Bessel function of the third kind with index 1 and x ∈ R. For a definition of K 1 we refer to Abramowitz and Stegun (1970). The next procedure is the classical construction of such a distribution.
A probability measure µ on (R + , B (R + )) is said to be a generalized inverse Gaussian (GIG) distribution with parameters ν ∈ R, δ > 0, γ > 0 if its density with respect to the Lebesgue measure is In order to show that f GIG is a density function, we need the following representation for K ν (Watson 1966, Formula (8), p. 182), the modified Bessel function of the third kind with index ν: The computation of the next integral allows us to find the normalization constant in (A2): where in the second equality we made the substitution y = γ δ x. Since K ν > 0 in R + from (A3), we can conclude that f GIG is actually a density function on R + . Let us fix two other parameters α, β ∈ R such that 0 ≤ |β| < α. For any µ ∈ R, y ∈ R + we denote by f N (·; µ, β, y) the density of the probability measure on (R, B (R)) generated by a random variable X ∼ N (µ + βy, y). It is possible to introduce a new distribution on (R, B (R)) considering the following function: where in the last but one equality we made the substitution z = α δ 2 +(x−µ) 2 y and in the last we used (A3).
A straightforward application of Tonelli's theorem shows that f GH (·; ν, α, β, µ, δ) is a density function: the corresponding probability measure on (R, B (R)) is called Generalized Hyperbolic distribution.
We recover the N IG density (A1) taking ν = − 1 2 in (A4). In fact, for every ν ∈ R, it results K −ν = K ν and the representation (Abramowitz and Stegun (1970), Formula (9.6.23)) which holds for every ν > − 1 2 , enables us to conclude As a particular case of Generalized Hyperbolic distribution, the N IG probability measure is infinitely divisible (see, e.g., Barndorff-Nielsen and Halgreen 1977;Eberlein and Hammerstein 2004). The moment generating function for a random variable X ∼ N IG (α, β, µ, δ) is given by Risks 2020, 8, 108 19 of 28 and its generating triplet (see Appendix B for the definition of this concept) is given by For an explicit calculation, we refer to the paper Barndorff-Nielsen (1998).

Appendix B. Cumulant Function of Lévy Processes
Fix a probability space (Ω, F , P). It is well known that, if X = {X t } t is a Lévy process, then X t is infinitely divisible for every t ∈ R + 0 . On the other hand, for every infinitely divisible probability measure µ on R d , there exists a Lévy process X = {X t } t , which is unique up to identity in law, such that X 1 ∼ µ.
The Lévy-Khintchine representation theorem states that, given an infinitely divisible distribution µ on R d , its characteristic functionμ can be written as: where and A is a symmetric, positive semidefinite d × d matrix. The representation (A7) ofμ by (A, ν, γ) is unique and the triplet (A, ν, γ) is called generating triplet of the distribution µ. The generating triplet of a Lévy process X = {X t } t is the generating triplet of X 1 . Note that it is not necessary to take 1 D to have integrability in (A7). In fact, let h : R d → R d be a bounded function, with h (x) = x in a neighborhood of 0: we call it a truncation function. Since, for every z ∈ R d , in a neighborhood of 0 we have e i z,x − 1 − i z, h (x) ≤ 1 2 |z| 2 |x| 2 and, further, ) ν (dx) component wise, then the Lévy-Khintchine formula with respect to h becomes: We note that only γ h depends on the choice of the truncation function. It is possible to "extend" the argument of the exponential function in (A7), called characteristic exponent and denoted by ψ, to a subset of C d . Precisely, taking into account the set we can define the function Ψ : D → C as follows: where D := w ∈ C d : Re (w) ∈ C . We readily note that iz ∈ D for every z ∈ R d and the following equality holds: Ψ (iz) = ψ (z) , z ∈ R d . In this sense Ψ extends ψ to a subset of C d . We pinpoint that, in this paper, we set u, w := ∑ d j=1 u j w j , u, w ∈ C d , so ·, · is not the C d -Hermitian inner product. Theorem 25.17 in Sato (1999) affirms that, if X = {X t } t is a Lévy process generated by (A, ν, γ), then E e w,X t = e tΨ(w) for any w ∈ D, t > 0 : (A9) we call Ψ the cumulant function of X. As we have previously done, we can introduce a truncation function h and express Remark A1. Let X = {X t } t be a R d -valued Lévy process defined on a probability space (Ω, F , P), Ψ be its cumulant function and θ ∈ R d be such that E [exp ( θ, X t )] < ∞ for some t > 0. By Theorem 25.3 in Sato (1999), this is equivalent to require that θ ∈ C. We introduce the random variables Note that M = {M t } t is integrable, by assumption and the fact that tΨ (θ) is constant in Ω for every t ≥ 0. Let us construct the minimal augmented filtration F = (F t ) t≥0 of X, i.e., F t = σ N F 0 t for any t ≥ 0, where F 0 t t is the natural filtration of the process and N the collection of F -negligible sets; obviously, M is F-adapted. According to (Protter (2005), Chapter I, Theorem 31) F is right-continuous, so it satisfies the usual conditions, as well. If we fix t > s ≥ 0, then using (A9) and the properties of the Lévy-increments we see that M is a martingale with mean 1: Definition A1. Fix T > 0 and let θ ∈ C. The probability measure P θ on F T , with P θ ∼ P on F T , defined by dP θ dP := M T is called the Esscher transform of P with respect to θ. The density process is given by

Appendix C. Characteristics of Semimartingales
The aim of this section is to generalize the concept of generating triplet of a Lévy process to semimartingales. Here we mainly follow (Shiryaev and Jacod (2003), Chapter II).
We start off by fixing a stochastic basis (Ω, F , P; F), with F which satisfies the usual conditions. Given two stopping times S, T, the stochastic interval is the random set It is important to observe that the sections t ∈ R + 0 : (t, ω) ∈ A , for ω ∈ Ω, are at most countable when A is a thin set.
Theorem A1. If X = {X t } t is a RCLL, adapted process, then the random set {∆X = 0} is thin. We refer to (He et al. 2018, Theorem 3.32) for a proof. Thanks to this result we can easily introduce the next concept.
Definition A3. Let X = {X t } t be an adapted, RCLL, R d -valued process. The measure µ X on R + 0 × R d defined by where δ (a) denotes the Dirac measure at a point a, is called measure associated with its jumps.
We now consider a process X = {X t } t which is a d-dimensional semimartingale, highlighting that we restrict our attention to RCLL and F-adapted semimartingales. Let h be a truncation function; using the measure µ X it is possible to derive the characteristics of X (see Shiryaev and Jacod 2003, Chp. II, Definition 2.6), which we denote by the triplet B, C, ν X . They are defined up to a P-null set and only B depends on the choice of h.
Every Lévy process-once we endow the probability space in which it is defined with its minimal augmented filtration-is a PI IS process (a RCLL, adapted process which starts at 0 and has independent and stationary increments), which in turn is a semimartingale. Thus, considering a Lévy process L = {L t } t with generating triplet (A, ν, γ h ) relative to a truncation function h, we can apply (Shiryaev and Jacod 2003, Chp. II, Corollary 4.19) together with the uniqueness of the Lévy-Khintchine representation to express its characteristics as (γ h t, At, dt ⊗ ν (dx)) .

Appendix D. Laplace Cumulant and Geometric Esscher Measure
On a stochastic basis (Ω, F , P; F), with the filtration F which satisfies the usual conditions of right-continuity and completeness, we define a d-dimensional semimartingale X = X 1 , . . . , X d , i.e., a process which can be decomposed as X = M + A, a sum of a local martingale M and a process A of locally finite variation which is called drift, or additive compensator, of the semimartingale, and with characteristics B, C, ν X relative to a truncation function h (see Appendix B for the definition). In particular, a special semimartingale is a semimartingale where its finite variation part is predictable; such a special semimartingale has a unique semimartingale decomposition. In view of (Shiryaev and Jacod 2003, Chp. II, Proposition 2.9) we can assume, without loss of generality, that ν X {t} × R d ≤ 1 identically. An exposition on semimartingale characteristics is outside of the scope of this paper, for this we refer to Shiryaev and Jacod (2003). Now we introduce the stochastic logarithm and the stochastic exponential, denoted by L and E , respectively. The symbol · denotes stochastic integration, the exact meaning can vary according to the context. Let X be a semimartingale; the set of stochastic processes which are integrable with respect to X is denoted by L(X). This notion is quite intricate, and it is not the purpose of this article to give an overview on the theory of stochastic integration. For this, we refer to (Protter 2005, Chp. IV, Section 2). The solution of the stochastic integral equation is denoted as the stochastic exponential E (X) of the semimartingale X. Here, the process Y − = is a semimartingale such that both Y and Y − do not vanish, then the process denoted by L(Y), is called the stochastic logarithm of Y, and is the unique semimartingale such that For a detailed exposition of these concepts we refer to (Shiryaev and Jacod 2003, Chp. II, Section 8) Definition A4. Let θ ∈ L (X) be an admissible strategy such that θ · X is exponentially special, i.e., exp(X) is a special semimartingale. The Laplace cumulant K X (θ) of X at θ is the additive compensator of the real-valued, special semimartingale L (exp (θ · X)) .
The modified Laplace cumulant K X (θ) of X at θ is the process Note that the additive compensator of the process is predictable because exp (θ · X) is special and 1 (exp(θ·X)) − is predictable. Therefore, L (exp (θ · X)) is a special semimartingale and the definition of K X (θ) is well posed. As far as K X (θ) is concerned, recalling (Shiryaev and Jacod 2003, Chp. III, Theorem 7.4) we have Thus, the stochastic exponential in (A10) is strictly positive and the process K X (θ) is well defined, as well.
If θ · X is exponentially special, then K X (θ) is its exponential compensator, see (Shiryaev and Jacod 2003, Chp. III, Theorem 7.14). This means that the process Z θ = Z θ t t , defined by Z θ t := exp θ · X t − K X (θ) t , t ≥ 0, is a local martingale starting at Z θ 0 = 1. We now recall the concept of uniform integrability.
Definition A5. A non-empty set Φ of real-valued random variable defined on a probability space (Ω, F , P) is uniformly integrable if Supposing that Z θ is a uniformly integrable martingale, then we can set P θ (dω) := Z θ ∞ P(dω), where Z t → Z ∞ a.s. and in L 1 as t → ∞. It is straightforward to show that Z θ is the density of P θ relative to P; besides, these two distributions are locally equivalent as Z θ t > 0, t ≥ 0. The following result (Shiryaev and Jacod 2003, Chp. III, Theorem 7.18) establishes a necessary and sufficient condition such that the process S i := S i 0 exp X i , S i 0 ∈ R + is a P θ -local martingale for every i = 1, . . . , d.
Theorem A2. Let θ ∈ L (X) be such that θ · X is exponentially special and Z θ is a uniformly integrable martingale. Define Then the processes S i = S i 0 exp X i are P θ -local martingales if and only if θ (i) · X is exponentially special and K X θ (i) = K X (θ) up to evanescence for every i = 1, . . . , d.
We call P θ geometric Esscher measure, or Esscher martingale transform for exponential processes. For d = 1, in case the geometric Esscher measure exists, (Kallsen and Shiryaev 2002, Theorem 4.2) provides its uniqueness: this means that, if we find another processθ ∈ L (X) such that S = S 0 exp (X) is a Pθ-local martingale, then Pθ = P θ .
We conclude with some technical results. Let θ ∈ L (X) be such that θ · X is exponentially special. By (Shiryaev and Jacod 2003, Chp. III, Theorem 7.4) and (Shiryaev and Jacod 2003, Chp. II, Proposition 2.9) we can write K X The next lemma (Kallsen and Shiryaev 2002, Lemma 2.11) allows us to express the drift process as a function of the characteristics of the semimartingale.
Lemma A1. Let θ ∈ L (X) be such that θ · X is a special semimartingale. Then its drift process D X (θ) = Finally, we use Girsanov's theorem to compute the characteristics B θ , C θ , ν X θ of X under P θ -provided its existence-relative to the same h:

Appendix E. Lévy Processes on Skorohod Space and Relative Entropy of Distributions
Let (Ω, F , P) be a probability space which carries a R d -valued, additive process {X t } t with system of generating triplets {(A t , ν t , γ t )} t . For every t ≥ 0 we define x t : D → R d , where D is the Skorohod Space, as x t (ξ) := ξ (t) , ξ ∈ D, and we introduce the σ-algebra F D := σ ({x t , t ≥ 0}). Now we are in the position to define a natural filtration F := (F t ) t≥0 on the measurable space (D, F D ), where F t := σ ({x s , 0 ≤ s ≤ t}) , t ≥ 0. Since {X t } t is an RCLL process, we set the map φ : and, for every B ∈ B R d and t ≥ 0, we have This enables us to construct the pushforward measure P D on (D, F D ), that is, We now focus on the stochastic process {x t } t defined on the probability space D, F D , P D . Fix a cylinder set C ∈ F D , i.e., for some t 1 < t 2 < . . . < t n , B 1 , . . . , B n ∈ B R d and n ∈ N. By (A13), we have P D (x t 1 ∈ B 1 , . . . , x t n ∈ B n ) = P D (C) = P φ −1 (C) = P (X t 1 ∈ B 1 , . . . , X t n ∈ B n ) .
Thus, {x t } t and {X t } t are identical in law, whence {x t } t is an additive process with the same system of generating triplets as {X t } t . Specifically, if {X t } t were a Lévy process, then {x t } t would inherit the temporal homogeneity, so it would be a Lévy process, as well.
Consider two Lévy processes ({x t } t , P) and ({x t } t , P ) on the Skorohod space (D, F D ) endowed with the filtration F. The next theorem (Sato 1999, Theorems 33.1 & 33.2) provides us with conditions which ensure P F t ∼ P F t for any t > 0.
Besides, considering the function φ : R d → R defined by φ := log dν dν , In this case, chosen η ∈ R d such that γ − γ − |x|≤1 x (ν − ν) (dx) = Aη, there exists a process U = {U t } t defined on D which satisfies the following properties: (i) U is a P-Lévy process on R with generating triplet (ii) E P e U t = E P e −U t = 1 for every t ≥ 0; (iii) e U t = dP F t dP F t P − a.s. for every t > 0.
An explicit expression of the process U, which is unique up to identity in law, can be retrieved in (Sato 1999, Theorem 33.2). Definition A6. Given two probability measures P, P on a measurable space (Ω, F ), the relative entropy H (P, P ) of P with respect to P is defined by H P, P :=    Ω log dP dP (ω) P (dω) , if P P ∞, otherwise .
Recall that given two distributions P, P on (Ω, F ) we have H (P, P ) ≥ 0, with equality if and only if P = P .
We finally present a theorem which explicitly computes the relative entropy of two equivalent Lévy processes in the Skorohod space as a function of their generating triplets.
Theorem A4. Let ({x t } t , P), ({x t } t , P ) be Lévy processes on R d defined on (D, F D ) with generating triplets (A, ν, γ) and (A , ν , γ ), respectively. Suppose that P F t ∼ P F t for every t > 0 and choose η ∈ R d such that Assume also that E P [g (U t )] < ∞ for some t > 0, where g (x) := (|x| ∨ 1) e |x| , x ∈ R. Then for every T > 0 it results Proof. Let us fix a finite time horizon T > 0. For any z ∈ (0, 1), by assumption we have We introduce the moment generating function M U T (z) := E P e zU T = e TΨ(z) , z ∈ (0, 1) , where Ψ is the cumulant function of the Lévy process U, i.e., Ψ (z) = 1 2 σ 2 U z 2 + γ U z + R (e zx − 1 − zx1 D (x)) ν U (dx) , z ∈ (0, 1) , and the last equality is ensured by (A9) in Appendix B. Actually M U T is well defined for z = 1 too, with M U T (1) = E P e U T = e TΨ(1) = 1, by A3 in Theorem A3. We can see that M U T is differentiable in (0, 1) with M U T (·) = E P U T e ·U T . Indeed, for every z ∈ (0, 1), M U T (z) = R e zx P U T (dx) and we can derive under integral sign since |x| e zx ≤ g (x) ∈ L 1 P U T , z ∈ (0, 1) , x ∈ R.
At this point the dominated convergence theorem readily shows that On the other hand, we introduce the function f (z) := e TΨ(z) , z ∈ (0, 1) . Even in this case we can affirm that f is differentiable in its domain, with derivative provided by f = Te TΨ Ψ . In particular the following equality is true: Ψ (z) = σ 2 U z + γ U + R (xe zx − x 1 D (x)) ν U (dx) , z ∈ (0, 1) .
In fact, we can derive under the integral sign in (A15) as, for every z ∈ (0, 1), it results that |xe zx − x| ≤ 1 + e, x ∈ D, with xe zx − x ≤ Cx 2 in a neighborhood of 0 for some constant C > 0 which is independent of z. Moreover, |x| e zx ≤ g (x) for |x| > 1, with |x|>1 |x| e |x| ν U (dx) < ∞ by Theorem 25.3 in Sato (1999). Applying another time the Lebesgue's convergence theorem we arrive at lim z→1 − M U T (z) = Te TΨ(1) σ 2 U + γ U + R (xe x − x 1 D (x)) ν U (dx) . Therefore E P U T e U T = Te TΨ(1) σ 2 using the expression of σ 2 U , ν U , γ U in A3 of Theorem A3. Since dP F t dP F t = e U t for every t > 0 (see A3 in Theorem A3) and we obtain (A14) and complete the proof.
Remark A3. In the case of two R-valued Lévy processes ({x t } t , P), ({x t } t , P ) with generating triplets σ 2 , ν, γ and σ 2 , ν , γ , respectively, under the hypothesis of the previous theorem (A14) reduces to If we are dealing with pure jump processes (e.g., N IG processes), the first term in the sum of the right-hand side is 0; if instead σ 2 > 0, then we have restoring (Cont and Tankov 2004a, Proposition 9.10).
Author Contributions: All authors have read and agreed to the published version of the manuscript.
Funding: Open Access Funding by TU Wien. This research received no external funding.