Maximum Entropy Evaluation of Asymptotic Hedging Error under a Generalised Jump-Diffusion Model

Abstract: In this paper we propose a maximum entropy estimator for the asymptotic distribution of the hedging error for options. Perfect replication of financial derivatives is not possible, due to market incompleteness and discrete-time hedging. We derive the asymptotic hedging error for options under a generalised jump-diffusion model with kernel bias, which nests a number of very important processes in finance. We then obtain an estimation for the distribution of hedging error by maximising Shannon's entropy subject to a set of moment constraints, which in turn yields the value-at-risk and expected shortfall of the hedging error. The significance of this approach lies in the fact that the maximum entropy estimator allows us to obtain a consistent estimate of the asymptotic distribution of hedging error, despite the non-normality of the underlying distribution of returns.


Introduction
The theory of pricing and hedging options has been the centre of attention in modern mathematical finance since the seminal Black-Scholes model. It provides a theoretical value and hedging strategy for European options, under the key assumption that there exists a trading strategy that constructs a portfolio that perfectly replicates the pay-off of an option. Furthermore, Black and Scholes assume the underlying stock price follows a geometric Brownian motion, and trading may take place in continuous time. With these assumptions, they show that the initial value of the replicating portfolio provides the initial price of the option. Moreover, the Black-Scholes analysis demonstrates that an option can be created synthetically by dynamically trading in the underlying asset. Nevertheless, it is well accepted that the perfect replication of options by any self-financing strategy is impossible, due to market incompleteness as well as discrete-time hedging. These two sources of error are termed the jump error and the gamma error.
Many researchers (see, for example, Cont et al. (2007) and Kennedy et al. (2009)) have studied the problem of hedging an option in an incomplete market, particularly where stock prices may jump. It is well understood that, except in very special cases, martingales with respect to the filtration of discontinuous processes cannot be represented in the form of a unique self-financing strategy, which leads to market incompleteness. At the jump time, both the model price of the option and the value of the hedging portfolio jump. The former is a non-linear function of the stock price, whereas the latter is a linear function of the stock price. Therefore, the jump induces a discrepancy between the value of the option and its replicating portfolio, and thus leads to a jump error. Furthermore, these researchers generally assume that the hedging portfolio can be continuously rebalanced, which is only possible in the absence of transaction costs. In practice, this level of liquidity is not available, and market practitioners rebalance their hedging portfolios using discrete-time observations, just a few times per trading day. The discrete hedging of derivative securities leads to the gamma error. This error is not easy to measure because the stochastic analysis techniques are not available in discrete time. Wang et al. (2015) provide a recent analysis of the literature regarding hedging in a discrete-time incomplete market. We base our research on the seminal work of Bertsimas et al. (2000) and Hayashi and Mykland (2005), who developed a methodology to analyse the discrete hedging error in a continuous-time framework using an asymptotic approach. This methodology was further developed by Tankov and Voltchkova (2009), who investigate the gamma risk by establishing a limit theorem for the renormalised error when the discretisation step tends to zero. Additionally, Rosenbaum and Tankov (2014) discuss the optimality conditions of discretised hedging strategies in the presence of jumps.
In this paper we contribute to the literature by approaching the problem from a different angle. We characterise the risk in a dynamic hedge of options through the asymptotic distribution of hedging error. In particular, we investigate the case of conventional delta hedging for a European call option, although other types of options may be treated in a similar manner. Furthermore, we obtain an estimation for the distribution of hedging error by maximising Shannon's entropy (Shannon (1948)) subject to a set of moment constraints, which in turn yields the value-at-risk (VaR) and expected shortfall (ES) of the hedging error, two widely-used risk metrics in finance. In the literature there exist two dominant approaches for constructing the distribution of hedging error, namely, the parametric and non-parametric approaches. The new approach that we propose in this paper chooses the probability distribution with the most uncertainty, or maximum entropy, subject to what is known. This allows us to obtain a consistent estimate of the asymptotic distribution of hedging error, despite the non-normality of the underlying distribution of returns. As a result, we can derive a very generalised modelling framework, which can be applied in different areas of derivatives pricing.
We first extend the methodology introduced in Hayashi and Mykland (2005) to model the asymptotic hedging error for vanilla call options when the underlying asset is governed by a generalised jump-diffusion model with kernel bias. The class of kernel biased completely random measures is a wide class of jump-type processes that can nicely be represented by a generalised kernel-biased mixture of Poisson random measures. The importance of using a kernel biased completely random measure is to derive a variety of forms of distortion of jump sizes through the kernel link function. For example, Fard and Siu (2013) discuss the importance of this representation as it provides great flexibility in modelling different types of finite and infinite jump activities compared with existing studies.
Next, we estimate the probability density function of the hedging error using the maximum entropy (ME) methodology. It allows the agent to effectively combine the aforementioned risk factors dynamically over time to update its belief about the possible distribution governing the terminal hedging error. Maximum entropy is widely used in estimation and information theory, in which beliefs are updated so that the posterior coincides with the prior as closely as possible. Furthermore, the ME methodology only updates those aspects of beliefs for which new evidence was gained (cf. Cover and Thomas (2012), Saghafian and Tomlin (2016) and references therein). Despite its widespread use in fields such as estimation theory, physics, statistical mechanics, and information theory, among others, it is only recently that researchers have begun to appreciate its usefulness in econophysics. For instance, Xi et al. (2014) use the maximum entropy model to study business cycle synchronisation of the G7 economic system. Gzyl and Mayoral (2016) use the maximum entropy principle to develop a non-parametric method of determining the prices of zero coupon bonds when the only information available consists of the prices of a few coupon bonds. Chan (2009) proposes a general modelling framework for the EM algorithm in approximating the distribution of financial returns in order to develop entropy-based risk metrics; Geman et al. (2014) and Xu et al. (2014) expand on Chan (2009) and develop a robust optimisation framework for portfolio construction; Mistrulli (2011) and Zhou et al. (2013) apply the entropy maximisation principle to measure financial contagion and systemic risk. Geman et al. (2014), who use the ME density in the VaR context, make an interesting observation that the real world is mostly ignorant about the importance of true probability distributions. They further point out that historically, finance theory has had a preference for parametric, less robust, methods.
An approach that is based on distributional and parametric certainties may be useful for research purposes but does not accommodate responsible risk taking. Their study shows the importance of the use of true probability distributions in VaR calculations.
The remainder of this paper is structured as follows. In Section 2 we present the calculations for pricing European options under a generalised jump-diffusion model with kernel bias. Furthermore, we generalise the Hayashi and Mykland (2005) framework to derive the asymptotic hedging error stemming from market incompleteness and discrete hedging. In Section 3, we obtain an estimation for the distribution of hedging error by maximising Shannon's entropy subject to a set of moment constraints, which in turn yields the value-at-risk and expected shortfall of the hedging error. Section 4 provides a numerical analysis to highlight the applicability of the method. Section 5 concludes the paper.

Modelling Framework
The inadequacy of the constant volatility of the Black-Scholes (henceforth, BS) option valuation model to replicate the characteristics of observed option prices is empirically evidenced in the finance literature. For example, the volatility smile for equity options exhibits a consistent pattern illustrating that implied volatilities are lower for options with higher strike prices. This is particularly the case for short-maturity options. Several methods have been suggested to accommodate the volatility smile in option pricing; among them, stochastic volatility models (see, e.g., Heston 1993; Hull and White 1987) and jump-diffusion models (see, e.g., Bates 1991; Merton 1976; Naik and Lee 1990) have become popular alternatives to BS's constant volatility.
Many studies provide evidence that stochastic volatility models are a significant improvement (cf. Dotsis et al. (2007)). Nevertheless, their applicability in empirical studies suffers from two major drawbacks, namely, an implausible correlation structure between returns and volatility, and an excessively high "volatility of volatility" (cf. Bates (2000); Diavatopoulos et al. (2012); Dotsis et al. (2007)). Amongst these, the seminal work of Bakshi et al. (1997) reviews a range of alternative option pricing models and concludes that stochastic volatility is of "first-order importance in improving upon the [Black-Scholes] formula", but that adding jumps may provide further improvement based on an out-of-sample test.
As a result, a large body of the recent literature has focused on augmenting stochastic volatility models with jumps. For example, Bakshi et al. (2003), Eraker et al. (2003), and Bakshi et al. (2012) conduct a set of empirical investigations for the case of index options, evidencing the importance of adding the jump component in the returns process. Chang et al. (2013) find that joint time series data of the underlying S&P 500 index and options on it strongly reject a stochastic volatility model without jumps. Shanahan et al. (2016) price long-maturity equity-linked products, as a composition of three embedded options, using the Meixner process (to capture jumps) and a diffusion process (to capture stochastic volatility).
On the other hand, option pricing with both jumps and stochastic volatility is rather cumbersome, as analytical solutions are rarely achievable. Even in the relatively simple context of vanilla American options, the studies of Bakshi et al. (2003) and Bakshi et al. (2012) use European option calculations on a restricted set of out-of-the-money options.
Recently, some theoretical improvements have been offered in the literature, such as the polynomial approximation of stochastic volatility in Shanahan et al. (2016). However, due to the absence of supportive empirical evidence, we chose a more practical alternative, proposed by Andersen and Andreasen (2000). By so doing, we aim to extend the analysis of Dupire (1994) to the case of jumps, whereby a model combining jumps with a deterministic volatility is developed. This type of model captures the observed behaviour of implied volatilities (see, e.g., Andersen and Andreasen 2000). In addition, it is much easier to use for contracts where no analytic solution is readily available. Moreover, note that adopting a deterministic local volatility function approach has an advantage in calibrating European option prices. Indeed, instead of solving a set of backward equations, one for each option of a different strike and maturity, one only needs to solve a single one-dimensional forward equation for all strikes and maturities.

Financial Markets
We fix a complete probability space (Ω, F , P), where P is the real-world probability measure. Let T denote the time index set [0, T] of the economy. Let {r(t)} t∈T be the instantaneous market interest rate of a money market account. Then, the dynamics of the value of the risk-free asset, {B t } t∈T , would be:
To model different types of finite and infinite jump activities, we adopt the kernel-biased representation of completely random measures of James (2002, 2005). To proceed, consider the measurable space (T , B(T )), where B(T ) is the Borel σ-field (generated by the open subsets of T ). We denote by B 0 the family of Borel sets U ⊂ R + , with closure Ū not containing 0. Define X = T × R + . Under these definitions, the measurable space (X , B(X )) is explicitly given by
Let N(·, U) denote a Poisson random measure for all U ∈ B 0 . We denote by N(dt, dz) the differential form of the measure N(t, U). In addition, define ρ(dz|t) as a Lévy measure that depends on t. Let η be a σ-finite (nonatomic) measure on T . Following James (2005), we assume that there exists an arbitrary positive function h(z) on R + , which along with ρ and η are chosen such that: and h 2 (z) ≤ t ρ(z), where is a càdlàg F t -adapted process. Define the intensity measure: as well as the kernel-biased completely random measure: The latter is a kernel-biased Poisson random measure N(dt, dz) over the state space of the jump size R + with the mixing kernel function h(z). We can replace the Poisson random measure with any random measure and choose some quite exotic functions for h(z) to generate different types of finite and infinite jump activities.
Let {W t } t∈T denote a standard Brownian motion on (Ω, F , P) with respect to the P-augmentation of its natural filtration. Let µ t and σ t denote the drift and volatility of the market value of the underlying asset, respectively. Consider a random jump process A := {A(t)|t ∈ T }, such that: where A 0 = 0.
We assume under P the price process {S t } t∈T is defined as S t := exp(A t ) so that: with S 0 = 1.

Esscher Transform
It is well known that the absence of arbitrage opportunities is necessary for the determination of a unique equivalent risk-neutral martingale measure, thus ensuring the fair valuation of the option (Pliska 1997). In this paper, we emphasise the possibility of having incomplete markets, and as such there may be more than one equivalent martingale measure, i.e., more than one no-arbitrage price. There are different methods to price and hedge derivative securities in incomplete financial markets. For example, one can choose an equivalent martingale measure by minimising the quadratic utility of the terminal hedging errors (see, e.g., Follmer and Sondermann 1986; Follmer and Schweizer 1991; Schweizer 1995). One can also adopt an economic approach based on the marginal rate of substitution to select a pricing measure via a utility maximisation problem (see, e.g., Davis 1997). Finally, one may employ the minimum entropy martingale measure method to determine the equivalent martingale measure (see, e.g., Avellaneda 1998; Fard and Siu 2012; Frittelli 2000).
In this study we employ the Esscher transform to determine an equivalent martingale measure for the valuation of the option (see Gerber and Shiu 1994). The method provides market practitioners with a convenient and flexible way to value options. For example, it has been shown that for exponential Lévy models, the Esscher martingale transform for the linear process is also the minimal entropy martingale measure, i.e., the equivalent martingale measure which minimises the relative entropy, and this measure also has the property of preserving the Lévy structure of the model. In the framework of exponential Lévy models, the study of equivalent martingale measures, their relationships, and their optimality properties has been developed in several directions; see Esche and Schweizer (2005), Hubalek and Sgarra (2006), and Tankov (2003) and the references therein.
Let F A := {F A t } t∈T and F S := {F S t } t∈T denote the P-augmentations of the natural filtrations generated by A and S, respectively. Since F A and F S are equivalent, we can use either one as an observed information structure. Write B(T ) for the Borel σ-field of T and let BM(T ) denote the collection of B(T )-measurable and nonnegative functions with compact support on T . For each process θ ∈ BM(T ), write: such that θ is integrable with respect to the return process. Let {Λ t } t∈T denote a G-adapted stochastic process: is a Laplace cumulant process and takes the following form: Therefore, The goal is to use Λ t in (3) as the Radon-Nikodym derivative to change the historical probability measure to the risk-neutral measure. Therefore, (3) is an essential part of our pricing formulation. An important characteristic of the risk-neutral measure is that every discounted price process is a martingale under this measure. To establish this key property, it is paramount to show that (3) This new measure P θ is defined by the Esscher transform Λ T associated with θ ∈ L(A).
The local-martingale condition, i.e., there exists an equivalent martingale measure under which discounted asset prices are local-martingales in the absence of arbitrage, is the foundation of asset pricing theory. Below, we state a necessary and sufficient condition for the local martingale condition in our framework.
Proposition 1. For each t ∈ T , let the discounted price of the risky asset at time t be: Then the discounted price process S := { S(t)|t ∈ T } is a P θ -local-martingale if and only if θ t := θ, X t , t ∈ T , is such that θ := (θ 1 , θ 2 , ..., θ N ) ∈ R N satisfies the following equation: See Appendix A for proof. The results from Lemma 1, Equation (4), and Proposition 1 allow us to use (3) to derive the risk-neutral dynamics of the return process.
Similarly, we can derive the risk-neutral price process of the reference portfolio.
Proposition 3. The price process of the reference portfolio S under P θ is: Proof. Recall S t := exp(A t ). Then the proof follows by applying Itô's Lemma and the martingale condition (5) to (6).
We study the hedging of a European option with pay-off function G using the popular delta hedging strategy. The option price is given by: where we assume C ∈ C ∞ ([0, T) × R). Furthermore, the delta hedging strategy is $H_t := \partial C / \partial S \,(t, S_t)$, which is the most widely-used hedging strategy with a mathematically tractable structure. A detailed discussion of the hedging strategy is provided in the next subsection.

Continuous Hedging Strategy
We assume the existence of a continuous-time trading strategy H. If continuous-time hedging were possible, agents in the market would like to follow this strategy. This strategy may be chosen in several ways and may not lead to a perfect replication when markets are incomplete. In this study, we do not address the relative advantages of different choices of H. Rather, we suppose that it follows another generalised jump-diffusion process that satisfies the same set of assumptions as S. As such, we assume that: under P, where a t , b t , and γ t are the parameters of the process that will be determined below. By applying Itô's Lemma to the definition of H t , we can show the following decomposition: Then,

Asymptotic Hedging Error
In what follows we derive the asymptotic distribution of the hedging error, generalising the methodology proposed in Hayashi and Mykland (2005) and Tankov and Voltchkova (2009).
Then the processes S n and H n are Lévy-Itô processes with bounded coefficients and bounded jumps that coincide with S and H on the set: Since all processes are càdlàg, P[Ω n ] → 1. The continuous rebalancing of a portfolio is practically infeasible. Typically, holders of a position in an option ∆-hedge at the discrete dates t i = iT/n. Therefore, the trading strategy is piecewise constant and given by H φ n (t) , where φ n (t) = sup{t i , t i < t}.
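To make the piecewise-constant strategy concrete, the following minimal sketch evaluates φ n (t) on the grid t i = iT/n. The function name `phi` and the convention of returning 0 when no grid date lies strictly before t are assumptions made for this illustration.

```python
import math

def phi(t, T, n):
    """Latest grid date t_i = i * T / n strictly before t.

    Returns 0.0 when t <= T / n (illustrative convention, including t = 0).
    Values of t that sit exactly on the grid are sensitive to float rounding.
    """
    i = math.ceil(t * n / T) - 1   # index of the last grid date below t
    return max(i, 0) * T / n

# With T = 1 and n = 4 the grid is 0, 0.25, 0.5, 0.75, 1.
held_at = [phi(t, 1.0, 4) for t in (0.1, 0.3, 0.5, 1.0)]
```

Under discrete hedging, the position held at time t is the continuous-time strategy evaluated at `phi(t, T, n)`, so it only changes at the rebalancing dates.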
The value of the hedging portfolio at time t is $V_0 + \int_0^t H_{s-}\, dS_s$ with continuous hedging and $V_0 + \int_0^t H_{\phi_n(s)}\, dS_s$ with discrete hedging. Then the asymptotic distribution of the difference between discrete and continuous hedging is: where n → ∞. For any process A we set A n t := A t − A φ n (t) . Under the above conditions, Hayashi and Mykland (2005) provide a thorough discussion of the stable convergence of the bounded processes to their respective original processes.
Furthermore, define the renormalised hedging error process by: Let { W̄ t } t∈T be a standard Brownian motion independent of W and N, and let (ξ k ) k≥1 and (ξ′ k ) k≥1 be two sequences of independent standard normal random variables and (ζ k ) k≥1 a sequence of independent uniform random variables on [0,1], such that the three sequences are independent of each other and of all other random elements. Let (T i ) i≥1 be an enumeration of the jump times of N, and define: Then, applying Theorem 1 in Tankov and Voltchkova (2009) to the renormalised discrete delta hedging error Z n t gives the following asymptotic convergence result in finite-dimensional laws:

Estimation of the Density of the Hedging Error
Let Z T be a realisation of the hedging error in (12) at date T (say, the maturity date of the option). Suppose that we have a random sample {Z(j) : j = 1, . . . , n} of n i.i.d. observations from Z T , each with pdf f Z and cdf F Z on a support Ω ⊆ R. Note from (12) that Z T always has a continuous density almost everywhere (a.e.). In this section, the VaR and the ES associated with Z T are of inferential interest. The VaR associated with Z T at level α is defined as: while the expected shortfall is given as: where z α = inf{z ∈ R : F Z (z) ≥ α} is the lowest α-quantile and 1 A (z) = 1 if z ∈ A and 1 A (z) = 0 otherwise. The dual representation of (14) is given by: where Q is absolutely continuous with respect to P and such that $dQ/dP \le \alpha^{-1}$. If the density of Z T is continuous, then the expected shortfall is equivalent to the tail conditional expectation defined by TCE α (Z) = E[−Z T |Z T ≤ −VaR α (Z T )] . If F Z (or f Z ) were given, then the computation of VaR α (Z T ) and ES α (Z T ) would be straightforward from (13)-(15). This is unfortunately not the case, and one has to resort to estimation techniques to approximate them. Suppose that we have a consistent estimate of f Z , say f̂ Z . Then we can also estimate its cdf and compute the estimates of the value-at-risk and the expected shortfall as: Our main goal is to find an estimator f̂ Z of f Z that captures the maximum uncertainty in Z T . To achieve this goal, we will use the information entropy approach.
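As an illustration of how the two risk measures can be read off a sample of hedging errors, the sketch below uses the empirical cdf in place of F Z . It assumes the common sign convention VaR α = −z α , the Acerbi-Tasche form of expected shortfall, and that α is the tail probability (for example 0.05 for a 95% VaR); the function names and the toy sample are hypothetical.

```python
import math

def lower_quantile(sample, alpha):
    """Lowest alpha-quantile inf{z : F_n(z) >= alpha} of the empirical cdf F_n."""
    xs = sorted(sample)
    n = len(xs)
    # Smallest 1-based rank k with k / n >= alpha (small tolerance for float noise).
    k = max(1, math.ceil(alpha * n - 1e-12))
    return xs[k - 1]

def var_es(sample, alpha):
    """Empirical VaR and ES at tail probability alpha (losses are negative values)."""
    n = len(sample)
    z_alpha = lower_quantile(sample, alpha)
    tail = [z for z in sample if z <= z_alpha]
    p_tail = len(tail) / n
    # The second term corrects for probability mass sitting exactly at z_alpha.
    es = -(sum(tail) / n + z_alpha * (alpha - p_tail)) / alpha
    return -z_alpha, es

# Hypothetical hedging errors, for illustration only.
errors = [-5.0, -3.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
var_20, es_20 = var_es(errors, 0.20)
var_25, es_25 = var_es(errors, 0.25)
```

The same two functions could be applied to draws from an estimated density f̂ Z rather than to raw observations; the expected shortfall is never smaller than the VaR at the same level.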

Information Entropy and Density Estimation
The information entropy associated with Z T is defined as: where, by convention, we assume that 0 × ln(0) = 0. I E (Z T ) is a measure of the information carried by Z T . As data are communicated more, they are corrupted with more noise so that the entropy increases, therefore they carry less information.
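A small numerical check may help fix ideas. The sketch below approximates Shannon's differential entropy, −∫ f ln f dz, by the trapezoid rule with the convention 0 × ln(0) = 0, and compares a standard normal density with a Laplace density of the same variance. The function names, grid, and choice of comparison density are assumptions made for this illustration.

```python
import math

def entropy(pdf, lo, hi, n=24000):
    """Trapezoid approximation of -int f ln f over [lo, hi], with 0 * ln(0) := 0."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        f = pdf(lo + i * h)
        term = -f * math.log(f) if f > 0.0 else 0.0
        total += term * (0.5 if i in (0, n) else 1.0)  # half weight at the endpoints
    return total * h

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def laplace_pdf(z, b=1.0 / math.sqrt(2.0)):
    # Scale b is chosen so that the variance 2 * b**2 equals 1.
    return math.exp(-abs(z) / b) / (2.0 * b)

h_normal = entropy(normal_pdf, -12.0, 12.0)
h_laplace = entropy(laplace_pdf, -12.0, 12.0)
```

With mean and variance fixed, the normal density has the larger entropy, which is the sense in which the maximum entropy principle selects the least informative distribution compatible with the imposed moments.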
Let g (k) Z : Ω → R (k = 0, 1, 2, . . . ) be a moment function of Z T . Then, the moment of Z T with respect to g (k) Z is defined as: In practice, polynomial functions are often used for the moment function (for example, see Zellner and Highfield (1988)), i.e., g (k) Z (z) = z k , k = 0, 1, 2, . . . , with the normalisation g (0) Z (z) = 1. In this case, we have: For the remainder of the paper, we assume the sequence (µ k ) k≥1 satisfies Carleman's condition (see Akhiezer (1965)), i.e., Note that when Ω = R and g (k) Z (z) = z k for all k = 0, 1, 2, . . . , condition (20) is sufficient for the determinacy of the Hamburger moment problem.
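For reference, the quantities discussed in this paragraph are usually written as follows. These are the standard textbook forms, stated under the assumption that µ k denotes the k-th moment and that Carleman's condition is taken in its Hamburger version; the paper's own displays may differ in notation.

```latex
\begin{align*}
  \mu_k &= \mathbb{E}\!\left[ g^{(k)}_Z(Z_T) \right]
         = \int_{\Omega} g^{(k)}_Z(z)\, f_Z(z)\, dz,
         \qquad k = 0, 1, 2, \dots \\
  % polynomial moment functions g^{(k)}_Z(z) = z^k, with g^{(0)}_Z(z) = 1
  \mu_k &= \int_{\Omega} z^{k} f_Z(z)\, dz, \qquad \mu_0 = 1, \\
  % Carleman's condition for the Hamburger moment problem (\Omega = \mathbb{R})
  \sum_{k=1}^{\infty} \mu_{2k}^{-1/(2k)} &= +\infty .
\end{align*}
```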
Due to issues associated with using many moments, we only use k = 0, . . . , m moments in (19), and our goal is to find the density function f Z (z) such that: For now, we assume that the number m of selected moments of the pdf in (19) that have been chosen to match the empirical moments is fixed. Later on, we discuss how m can be selected in a data-dependent manner based on a Bayesian- or Schwarz-type information criterion. Since g (0) Z (z) = 1, we have m = 4 and problem (22) becomes: For the remainder of the paper, we call the solution f̂ Z (z) of problem (22) the maximum-entropy estimator of f Z (z). We will now establish that such a solution exists and is unique. Let L( f Z , λ 0 , . . . , λ m ) denote the Lagrange function associated with (22), where λ 0 , . . . , λ m are the Lagrange multipliers. By noting that solving (22) is equivalent to the static problem: it is sufficient to show that (25) has a unique solution with respect to f Z in the set { f Z ∈ L 1 (Ω) : f Z ≥ 0}. More often, the minimisation over f Z is carried out via the first variation of L( f Z , λ 0 , . . . , λ m ) with respect to f Z . This type of calculation is misleading because L( f Z , λ 0 , . . . , λ m ) has support on the set { f Z ∈ L 1 (Ω) : f Z ≥ 0 a.e.}, and the complement of this set is dense in L 1 , meaning that not only is L( f Z , λ 0 , . . . , λ m ) nowhere differentiable in L 1 , it is also nowhere continuous. For these reasons, we follow the convex duality approach in Borwein et al. (2003). From (Borwein et al. 2003, Theorems 1-2), there is a unique solution f Z of (25) satisfying: where λ = (λ 0 , λ 1 , . . . , λ m ) , θ = exp(1 + λ 0 ), and the zeroth-moment equality implies that: To complete the closed form of f Z (z; λ), we must substitute λ in (26) by an optimal value λ̂. Zellner and Highfield (1988) consider the case in which g (k) Z (z) = z k where k = 1, . . . , m = 4, and use an algorithm based on a Newton method to compute λ 1 , . . . , λ 4 from the restrictions in (22).
This numerical approximation is cumbersome even for a moderate choice of m = 4. In this paper, we propose the maximum likelihood (ML) method to estimate λ 1 , . . . , λ m , and θ.
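Since {Z(j) : j = 1, . . . , n} are i.i.d. with common pdf $\frac{1}{\theta}\exp\left(-\sum_{k=1}^{m}\lambda_k\, g^{(k)}_Z(z(j))\right)$, the likelihood function of the sample can be written as:
$$ L(\lambda) = \prod_{j=1}^{n} \frac{1}{\theta}\exp\left(-\sum_{k=1}^{m}\lambda_k\, g^{(k)}_Z(Z(j))\right). \qquad (27) $$
The log-likelihood function from (27) is then given by
$$ \ell(\lambda) = -n\ln\theta - \sum_{j=1}^{n}\sum_{k=1}^{m}\lambda_k\, g^{(k)}_Z(Z(j)), $$
and we can prove the following on the existence of unique solutions for both problems (28) and (22).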
. . , m. Then: (a) Problem (28) has a unique solution with respect to λ; (b) Problem (22) has a unique solution with respect to f Z (·).
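The estimation step can be sketched numerically as follows. For polynomial moment functions the density (1/θ) exp(−∑ λ k z k ) is an exponential family, so maximising the log-likelihood amounts to matching the sample moments, which is the same as minimising ln θ(λ) + ∑ λ k µ̂ k . The sketch does this with plain Newton steps on a bounded grid. The function names, the grid bounds, the use of Newton's method without a line search, and the target moments are all assumptions of this illustration and are not taken from the paper.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][j] * x[j] for j in range(r + 1, n))) / aug[r][r]
    return x

def fit_me_density(moments, lo, hi, n_grid=2001, max_iter=100, tol=1e-10):
    """Return lambda_1..lambda_m such that the density proportional to
    exp(-sum_k lambda_k z^k) on [lo, hi] has E[z^k] = moments[k-1]."""
    m = len(moments)
    h = (hi - lo) / (n_grid - 1)
    zs = [lo + i * h for i in range(n_grid)]
    ws = [h] * n_grid
    ws[0] = ws[-1] = 0.5 * h                         # trapezoid weights
    pw = [[z ** k for k in range(1, m + 1)] for z in zs]
    lam = [0.0] * m
    for _ in range(max_iter):
        # Unnormalised density times quadrature weight; theta is its integral.
        u = [w * math.exp(-sum(l * p for l, p in zip(lam, row))) for w, row in zip(ws, pw)]
        theta = sum(u)
        mean = [sum(ui * row[k] for ui, row in zip(u, pw)) / theta for k in range(m)]
        grad = [moments[k] - mean[k] for k in range(m)]   # gradient of the dual objective
        if max(abs(g) for g in grad) < tol:
            break
        # Hessian of the dual objective: covariance of the moment functions.
        cov = [[sum(ui * row[a] * row[b] for ui, row in zip(u, pw)) / theta
                - mean[a] * mean[b] for b in range(m)] for a in range(m)]
        step = solve(cov, grad)
        lam = [l - s for l, s in zip(lam, step)]          # plain Newton update
    return lam

# Target: mean 0 and second moment 1, for which the ME density is standard normal.
lam_hat = fit_me_density([0.0, 1.0], -10.0, 10.0)
```

With only the first two moments imposed, the fitted multipliers should be close to λ 1 = 0 and λ 2 = 1/2, recovering the normal density. For larger m a damped or line-searched update may be needed, since the plain Newton step is not guaranteed to converge from an arbitrary start.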

Choice of m and the Moment Functions in the Density Estimation
To avoid the problems associated with using many moments, we suggest using a finite number m of moments in the estimation of the density function f Z (z). In practice, the choice of m, as well as that of the moment functions g (k) Z (·), k = 1, . . . , m, may not be obvious. In this section, we briefly discuss how both m and g (k) Z (·), k = 1, . . . , m, can be approximated in a data-dependent manner. First, note from (26) that the density function f Z (z; λ) satisfies: For the choice of moment functions, we replace each g (k) Z (z) in (33) with its truncated Taylor series expansion around the expectation of z. From this, the problem of looking for the proper moment functions g (k) Z (z), k = 0, 1, . . . , m, is the same as finding an optimal order of truncation for each of the expansions: where β 0 , β 1 , β 2 , β 3 , . . . are unknown coefficients to be estimated. We are interested in finding m and the truncation order l 0 of the power series in (34). So, the search for m and the moment functions is converted to a search for the optimal truncation orders m and l 0 that yield the best fit of f̂ Z (z; λ̂) to the data. We suggest using the Bayesian information criterion (BIC) or Schwarz information criterion (SIC) based on (34) to select the optimal m and l 0 . Let λ̂ MLE denote the ML estimator of λ using the optimal order of truncation l 0 . Then, the estimated pdf of f Z (z) is f̂ Z (z; λ̂ MLE ), which can then be used to compute VaR α (Z) and ES α (Z) in (16).
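The selection step can be illustrated with a short sketch. It assumes the usual definition BIC = −2 ln L + p ln n, with p the number of estimated multipliers, and picks the order with the smallest value. The function names and the log-likelihood values below are hypothetical placeholders, not results from the paper.

```python
import math

def bic(loglik, n_params, n_obs):
    """Bayesian (Schwarz) information criterion; smaller is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def select_order(logliks, n_obs):
    """logliks maps a candidate order m to its maximised log-likelihood."""
    return min(logliks, key=lambda m: bic(logliks[m], m, n_obs))

# Hypothetical maximised log-likelihoods for m = 2, ..., 5 and n = 1000 observations.
candidate_logliks = {2: -1420.0, 3: -1415.0, 4: -1414.5, 5: -1414.4}
best_m = select_order(candidate_logliks, n_obs=1000)
```

In this toy input the likelihood gain from m = 3 to m = 4 is too small to offset the ln n penalty, so the criterion stops at m = 3.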

Numerical Analysis
In this section we conduct a numerical experiment to analyse the sensitivity of the hedging error with respect to model parameters. In the previous sections we have defined a general jump-diffusion process with the jump component specified by a kernel biased completely random measure. This generalised framework nests a number of very important models in mathematical finance, including, but not limited to, the jump diffusion model of Merton (1976), the generalised gamma process discussed in Lo and Weng (1989), the variance gamma process by Madan et al. (1998), and the CGMY model of Carr et al. (2002).
Here, for simplicity, we only use the generalised gamma (GG) process. The analysis can be easily extended to other classes of models or even their Markovian regime switching versions discussed in Fard and Siu (2013).
The GG process generalises several famous models in finance. For example, the inverse gamma (IG) and weighted gamma (WG) processes are special forms of the GG process. Let α ≤ 1 denote a constant shape parameter and δ(t) be the time-dependent scale parameter of the GG process. We can express the intensity process of the GG process as: where Γ(·) is the gamma function. We note that this process simplifies to a WG process when α = 1 and to an IG process when α = 1/2. Several observations are in order. First, to obtain the GG process, we must set the kernel function h(z) to cz q , for some constants c and q, and choose a particular parametric form of the compensator measure. Second, different versions of GG processes are obtained for different sets of kernel function parameter values. For example, if c > 0 and q = 1 (linear kernel function), then we obtain the scale-distorted version of the GG process. If c > 1, then the jump sizes are overstated, while they are understated if 0 < c < 1. When q > 0 and c = 1, we obtain the power-distorted version of the GG process. When q > 1, small jump sizes (i.e., 0 < z < 1) are understated and large jump sizes (i.e., z > 1) are overstated. Finally, when 0 < q < 1, small jump sizes are overstated and large jump sizes are understated.
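The effect of the kernel parameters on individual jump sizes can be checked directly. The sketch below applies h(z) = cz q to one small and one large jump; the function name `kernel` and the sample jump sizes are chosen only for illustration.

```python
def kernel(z, c=1.0, q=1.0):
    """Power kernel h(z) = c * z**q applied to a jump size z > 0."""
    return c * z ** q

small, large = 0.5, 2.0                                  # one jump below 1, one above 1
power = [kernel(z, q=2.0) for z in (small, large)]       # q > 1, c = 1
root = [kernel(z, q=0.5) for z in (small, large)]        # 0 < q < 1, c = 1
scaled = [kernel(z, c=1.5) for z in (small, large)]      # c > 1, q = 1
```

Here `power` shrinks the small jump and inflates the large one, `root` does the opposite, and `scaled` inflates both by the same factor, matching the three cases described above.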
To simulate the GG process, we use the Poisson weighted algorithm by Lee and Kim (2004). Throughout the simulations, we set α = 1/2. It is worth mentioning that the Poisson weighted algorithm applies to a wide class of completely random measures, which are very difficult to simulate directly; see Lee and Kim (2004).
To implement the algorithm, divide the time interval to maturity [0, T] into nT equally spaced subintervals. Then for each j = 0, 1, ..., n − 1, let [t j , t j+1 ] be the (j + 1)st subinterval. Let M denote the number of jumps of the completely random measure over the term to maturity, such that, M controls the accuracy of the approximation of the algorithm. To implement the Poisson weighted algorithm, we take the following steps: Step 1.
Generate the jump size Ω i from the conditional density function g T i (i.e., gamma); Step 4.
Evaluate whether T i ∈ [t j , t j+1 ). If yes, calculate: λ i is the intensity of the Poisson distribution, used to generate the Poisson weights W.
We provide a comparative analysis of the density function of the hedging error obtained by our ME estimation of the hedging error with that obtained by Monte Carlo simulation. In Monte Carlo simulation, we calculate the hedging error at t n , ∆Π n , n = 1, . . . , N, with hedging portfolio Π n := Π(t n , S) defined by: where the option price and delta are computed by evaluating the risk neutral expectation in Equation (8).
We consider hedging a one-month at-the-money call option with S(0) = K = 1. We analyse different hedging frequencies, namely, N = 4 (weekly hedging), N = 20 (daily hedging), N = 80 (hedging every hour and a half), and N = 320 (hedging every 20 min). For brevity, we assume the WG process with c = q = 1, equal model parameters under P and P θ , zero interest rate, and zero drift for the underlying asset. Furthermore, we assume constant volatility, σ = 18%. Then, the set of rebalancing times is {t n }, t n = nδt, n = 1, . . . , N with δ = 0.1. All computations are fully vectorised in Julia.
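To give a feel for the Monte Carlo side of the comparison, the sketch below simulates discrete delta hedging of the same one-month at-the-money call with σ = 18%, zero interest rate, and zero drift. It is a deliberate simplification: the underlying follows a pure geometric Brownian motion with Black-Scholes prices and deltas, so the jump component of the GG process is left out. All function names, the number of paths, and the seed are assumptions of this illustration.

```python
import math
import random
import statistics

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, sigma, tau):
    """Black-Scholes call price and delta with zero interest rate."""
    if tau <= 0.0:
        return max(s - k, 0.0), (1.0 if s > k else 0.0)
    d1 = (math.log(s / k) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * norm_cdf(d2), norm_cdf(d1)

def hedging_errors(n_steps, n_paths=2000, s0=1.0, k=1.0, sigma=0.18, T=1.0 / 12.0, seed=0):
    """Terminal hedging error (portfolio value minus pay-off) for each simulated path."""
    rng = random.Random(seed)
    dt = T / n_steps
    errors = []
    for _ in range(n_paths):
        s = s0
        value, delta = bs_call(s, k, sigma, T)        # portfolio starts at the premium
        for i in range(1, n_steps + 1):
            shock = rng.gauss(0.0, 1.0)
            s_new = s * math.exp(-0.5 * sigma ** 2 * dt + sigma * math.sqrt(dt) * shock)
            value += delta * (s_new - s)              # zero rate: cash earns nothing
            s = s_new
            _, delta = bs_call(s, k, sigma, T - i * dt)   # rebalance to the new delta
        errors.append(value - max(s - k, 0.0))
    return errors

errors_weekly = hedging_errors(4)     # N = 4
errors_fine = hedging_errors(80)      # N = 80
sd_weekly = statistics.stdev(errors_weekly)
sd_fine = statistics.stdev(errors_fine)
```

In this diffusion-only setting the dispersion of the error shrinks as the number of rebalancing dates grows, while its mean stays close to zero. With jumps added, a residual error would remain even for very frequent rebalancing.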
In this setting, the kernel-biased completely random measure on T is . It can be shown that the process generated by the Poisson weighting algorithm converges in distribution to κ(t) defined on [0, T] with the Skorohod topology as M → ∞ (see Lee and Kim 2004). Note that the true values of the truncation lag (m) and truncation order (l_0) in the entropy density are set at m = l_0 = 4. The optimal choice of these parameters under the BIC criterion varies across the pseudo-samples, but the minimum and maximum choices of each parameter are 2 and 5, respectively.

Figure 1 compares the distribution of hedging error obtained from the Monte Carlo simulation with that estimated by the ME method. In Table 1 we report the summary statistics for the distribution of hedging error, as well as the 99% and 95% VaR, obtained by the two methods. The numerical analysis shows that, for a small number of trades, the distribution of hedging error has a high volatility and is negatively skewed, which indicates that large losses are more likely than large gains. It is noteworthy that the ME estimation of the distribution fails to sufficiently capture the right tail of the empirical distribution; nevertheless, it adequately describes the left tail.

As we increase the frequency of the delta hedging trades, the distribution becomes more symmetric, and our ME estimation performs better. For N = 20, our distribution is strongly leptokurtic, indicating the option writers' large exposure to jumps. When the trading frequency increases to 80 and 320, the distribution peaks around zero and the volatility decreases significantly; however, the left tail is still noticeable. From our analysis, it appears that the volatility is negatively related to the frequency of delta hedging trades. However, further research is required to conclusively establish the relationship between the number of trades and the moments of the distribution.
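As an illustration of how tail risk measures such as those in Table 1 can be read off a simulated sample, the sketch below computes an empirical VaR and ES from a list of hedging errors. It is a sample-based calculation under one common convention (losses are the negatives of the hedging errors, VaR is the empirical quantile of the losses, and ES is the average of the losses at or beyond the VaR); it is not the ME-based estimator of the paper, and the function name is hypothetical.

```python
import math


def var_es(errors, level=0.99):
    """Empirical VaR and ES of hedging errors at the given confidence level."""
    if not errors:
        raise ValueError("errors must be non-empty")
    losses = sorted(-e for e in errors)  # ascending losses
    n = len(losses)
    # Index of the empirical quantile; the small offset guards against
    # floating-point noise in level * n.
    k = max(min(math.ceil(level * n - 1e-9) - 1, n - 1), 0)
    var = losses[k]
    tail = losses[k:]  # losses at or beyond the VaR
    es = sum(tail) / len(tail)
    return var, es
```

By construction the ES is never smaller than the VaR at the same level, so a heavier left tail of the hedging error shows up as a wider gap between the two numbers.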

Conclusions
Perfect replication of financial derivatives is not possible, given market incompleteness and discrete-time hedging. We characterised the risk in the dynamic hedging of European options through the asymptotic distribution of hedging error. Furthermore, we obtained an estimation for the distribution of hedging error by maximising Shannon's entropy (Shannon 1948) subject to a set of moment constraints, which in turn yielded the value-at-risk (VaR) and expected shortfall (ES) of the hedging error, two widely used risk metrics in finance. This new approach chooses the probability distribution with the most uncertainty subject to what is known. Thus we obtained a consistent estimate of the asymptotic distribution of hedging error, despite the non-normality of the underlying distribution of returns. As a result, we derived a general modelling framework, which can be applied in different areas of derivatives pricing. Some parametric specifications of this framework include, but are not limited to, the jump diffusion model of Merton (1976), the generalised gamma process discussed in Lo and Weng (1989), the variance gamma process of Madan et al. (1998), and the CGMY model of Carr et al. (2002).
Finally, we conducted a robust numerical simulation of the result, to highlight the practical applications of our model.

Acknowledgments: The authors would like to thank Mervyn Silvapulle of the Department of Econometrics and Business Statistics at Monash University, for his insightful comments and helpful suggestions. Firmin Doko Tchatoka acknowledges the financial support from the School of Economics, the University of Adelaide.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A. Proof of Lemma 1
James (2002, 2005) shows that: Then, by taking the conditional expectations of (3), the results follow.