Asymptotically Normal Estimators of the Ruin Probability for Lévy Insurance Surplus from Discrete Samples

Statistical inference for the ruin probability from a certain discrete sample of the surplus is discussed under a spectrally negative Lévy insurance risk. We consider the Laguerre series expansion of the ruin probability and provide an estimator for any of its partial sums by computing the coefficients of the expansion. We show that the proposed estimator is consistent and asymptotically normal, with the optimal rate of convergence and an estimable asymptotic variance. This estimator enables not only point estimation of the ruin probability but also approximate interval estimation and hypothesis testing.


Introduction
Ruin probability has long been one of the central topics in insurance mathematics, ever since the paper by Lundberg (1903), where a compound Poisson type surplus was assumed. Since then, various stochastic surplus models have been considered, and Lévy processes appear to be good candidates for insurance surplus models in several respects: (1) computational convenience; (2) compatibility with financial theories and dynamic risk management; (3) statistical prediction of the future surplus. On aspect (1): a Lévy process has independent and stationary increments, which yields many elegant formulae for the ruin probability and other ruin-related quantities via the fluctuation theory of Lévy processes; see Huzak et al. (2004), Feng and Shimizu (2013), and Kyprianou (2014), among others. On (2), Trufin et al. (2011) and Shimizu and Tanaka (2018) proposed dynamic risk measures based on the ruin probability and related quantities, which are useful not only in insurance but also in financial mathematics. See also Schoutens and Cariboni (2009) for relations to credit risk modeling. In this paper, we focus on aspect (3), which is the most important step in making ruin theory applicable in practice.

Ruin Probability Under Lévy Surplus
On a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, P)$ with the usual conditions, let $X = (X_t)_{t\ge 0}$ be an $(\mathcal{F}_t)$-Lévy process of the form $X_t = ct + \sigma W_t - L_t$, where $c > 0$ and $\sigma \in \mathbb{R}$ are constants, $W$ is an $(\mathcal{F}_t)$-Wiener process, and $L$ is a spectrally positive $(\mathcal{F}_t)$-Lévy process with Lévy measure $\nu$. We assume a specific form for the Laplace exponent of $L$, $\psi_L(u) := t^{-1}\log E[e^{-uL_t}]$, which implies that $\nu$ satisfies the condition $\int_0^1 z\,\nu(dz) < \infty$. (1) Assume that $-X$ is a risk process of an insurance company, where the constant $c > 0$ corresponds to the premium rate and $L - \sigma W$ corresponds to the randomness in the insurance business: aggregate claims, frequent costs, and uncertainties of premium income, for example. If $\int_0^\infty \nu(dz) < \infty$, then $L$ is a compound Poisson process that corresponds to the aggregate claims process only. If $\int_0^\infty \nu(dz) = \infty$, then $L$ has infinitely many small jumps in any finite time interval. In such a case, "large" jumps of $L$ are interpreted as "large" claims, and the small jumps approximate other uncertainties in costs and other businesses, as well as "small" claims that occur frequently. Therefore, it is natural to assume that $c > 0$ is known and that the constant $\sigma$ and the Lévy measure $\nu$ are unknown.
When the company has the initial surplus $x \ge 0$, the ruin probability is given by $\phi(x) = P(x + X_t < 0 \text{ for some } t > 0)$, $x \ge 0$. (2) The properties of the function $\phi$ are studied in detail in Huzak et al. (2004) under general Lévy insurance risks. See also Biffis and Morales (2010) for more general Gerber-Shiu functions. As is well known for Lévy processes, the ruin probability satisfies $\phi(x) < 1$ if and only if the following net profit condition holds: $c > \int_0^\infty z\,\nu(dz)$. (3) Otherwise, $\phi(x) \equiv 1$; see, e.g., Kyprianou (2014), Theorem 7.2. Under the conditions (1) and (3), the function $\phi$ satisfies a defective renewal equation (DRE) as given in Proposition 1, which easily leads to important results on $\phi$, such as the Laplace transform, the Pollaczek-Khinchine formula, and the Cramér-type approximation as $x \to \infty$, among others. In this paper, the DRE is essential for the construction of a statistical estimator of $\phi$. With a DRE approach for $\phi$, the conditions (1) and (3) are necessary and cannot be relaxed; see Remark 1. Hence, these conditions are assumed throughout the paper, even where not specifically mentioned.
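To make definition (2) concrete, the following sketch estimates the ruin probability by Monte Carlo in the simplest special case: a compound Poisson surplus with exponential claims and no diffusion term ($\sigma = 0$). This special case, the function names, and the stopping barrier are assumptions made for illustration only; they are not the estimator studied in this paper. In this classical case, ruin can occur only at claim instants, and the well-known closed form $\phi(x) = \rho\,e^{-(1-\rho)x/\mu}$ with $\rho = \lambda\mu/c$ serves as a reference value.

```python
import math
import random

def ruin_probability_mc(x, c, lam, mu, n_paths=2000, barrier=50.0, seed=0):
    """Monte Carlo estimate of P(x + X_t < 0 for some t > 0) for the
    classical surplus X_t = c*t - (compound Poisson claims), sigma = 0.

    Ruin can only happen at claim instants, so the surplus is followed at
    those instants only. A path is counted as "survived" once the surplus
    exceeds x + barrier (an approximation: ruin from that level is very
    unlikely under the net profit condition c > lam * mu).
    """
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        surplus = x
        while 0.0 <= surplus <= x + barrier:
            surplus += c * rng.expovariate(lam)    # premium until next claim
            surplus -= rng.expovariate(1.0 / mu)   # claim size with mean mu
        if surplus < 0.0:
            ruined += 1
    return ruined / n_paths

def ruin_probability_exact(x, c, lam, mu):
    """Closed form for exponential claims without diffusion."""
    rho = lam * mu / c
    return rho * math.exp(-(1.0 - rho) * x / mu)

estimate = ruin_probability_mc(1.0, c=15.0, lam=12.0, mu=1.0)
reference = ruin_probability_exact(1.0, c=15.0, lam=12.0, mu=1.0)
```

When the net profit condition fails, the embedded random walk has no upward drift, which is the simulation counterpart of $\phi(x) \equiv 1$; the loop above should then not be used as written, since paths may take very long to reach the barrier.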

Earlier Works on Estimating Ruin Probability
The ruin probability $\phi$ depends on some unknowns: $\sigma$ and functionals of $\nu$. This motivates actuarial researchers to estimate $\phi$ from past surplus data over a long time interval $[0, T_n]$. Recently, many authors have contributed to the statistical estimation of ruin-related quantities under not only classical compound Poisson risks but also Lévy insurance risks. Shimizu (2011), in the first statistical work on ruin-related quantities under a Lévy process with infinite activity jumps, uses a "regularized" Laplace inversion of an empirical estimator of the Laplace transform of the Gerber-Shiu function. The idea of estimation by regularized inversion is credited to Mnatsakanov et al. (2008), who considered a classical risk model for the estimation of $\phi$. The proposed estimators are consistent in the sense of the mean integrated squared error with the rate of convergence $\log T_n$. However, this rate is slower than the ideal rate $\sqrt{T_n}$ expected in this context, and the finite-sample performance suffers; see Zhang (2016) for some numerical experiments. This is due to the "regularized" Laplace inversion, where a tuning parameter is needed to handle the ill-posedness of Laplace inversion; see Carroll et al. (1991) and Chauveau et al. (1994) for details. See also Shimizu (2012). To overcome this problem, Zhang and Yang (2013) consider the Fourier inversion of an empirical Fourier transform of the ruin probability. Thanks to the one-to-one property of the Fourier transform on $L^2(\mathbb{R})$, their estimators achieve a better rate of convergence $T_n^{a/(2a+1)}$ for some $a > 0$. Moreover, the fast Fourier transform (FFT) algorithm allows easy computation of their estimators. See also Shimizu and Zhang (2017) for estimation of the Gerber-Shiu function, where the rate $T_n/\log T_n$ is realized.
A recent paper by Zhang and Su (2017) introduces a new idea for estimating the ruin probability $\phi$ (they actually deal with the Gerber-Shiu function) under a compound Poisson risk model. They estimate the partial sum $\phi_K$ of the Laguerre series expansion of $\phi$, where $\zeta_k$ is the $k$th-order Laguerre function, provided later in (10). They evaluate the $L^2(\mathbb{R}_+)$-error of an empirical estimator of $P_k$. Letting $\hat\phi_K$ be their estimator of $\phi_K$, they show that, for some $r > 0$, the error admits a bound depending on $K$ and $T_n$. Taking $K = T_n^{1/(r+1)}$ so that this bound is minimized yields the optimal rate of convergence (6). Note that $r$ is the parameter introduced in the definition of the Sobolev-Laguerre space $W(\mathbb{R}_+, r, B)$; see Section 2.1. Furthermore, it follows from Zhang and Su (2017) that $r$ can be taken arbitrarily large when $\phi$ is a combination of exponentials. Hence, in such a "good" case, the rate in (6) becomes close to $\sqrt{T_n}$. The earlier works thus establish only the consistency of their estimators with a rate of convergence, but not the asymptotic distribution of $\hat\phi_K$. This paper considers the same type of estimator as in Zhang and Su (2017), but under a Lévy risk process that is possibly of infinite activity. We show the asymptotic normality of the estimator $\hat\phi_K$ of $\phi_K$ with the rate $\sqrt{T_n}$: for each $x \ge 0$ and $K \in \mathbb{N}$, $\sqrt{T_n}\,(\hat\phi_K(x) - \phi_K(x)) \to_d N(0, \Sigma)$, where the asymptotic variance $\Sigma > 0$ is also estimable from the surplus data. Since $\phi_K$ approximates $\phi$ to any order, the asymptotic normality enables us to construct approximate confidence intervals and hypothesis tests for $\phi$.

Statistical Setting and General Notation
In the statistical argument, we assume that the surplus $X$ is observed in a time interval $[0, T_n]$ at discrete time points $t_i^n := i\Delta_n$ $(i = 0, 1, 2, \dots, n)$ with $\Delta_n > 0$. Note that $T_n = t_n^n$. Moreover, we also assume that "large" jumps of $L$ are observed. That is, for a given constant $\varepsilon_n > 0$, we observe the jumps of $L$ on $[0, T_n]$ that are larger than $\varepsilon_n$, and we do not use "small" jumps. Our observations, denoted by $D_n$ in (7), consist of these two parts. Later, we consider the asymptotic scheme (8) as $n \to \infty$, which is an ideal situation where most data for $X$ are available in the limit. This is the setting that should be considered first in inference for continuous-time stochastic processes. As a practical motivation, we would like to estimate the ruin probability $\phi(x)$ from a data set $D_n$. Moreover, we use the following notation in the paper:

• For a matrix $A = (a_{ij})_{1\le i\le p,\,1\le j\le q}$, $A^\top$ stands for the transpose: $A^\top = (a_{ji})_{1\le j\le q,\,1\le i\le p}$.
• For each $k \in \mathbb{N}$, $0_k$ is the zero vector in $\mathbb{R}^k$. Moreover, $O_k$ and $I_k$ are the $k \times k$ zero matrix and identity matrix, respectively.
• For functions $f$ and $g$, $f \lesssim g$ means that there exists a constant $C > 0$ such that $f(x) \le Cg(x)$ for all $x$.
• For $s \ge 0$ and $f \in L^1(\mathbb{R}_+)$, $\mathcal{L}$ stands for the Laplace transform operator: $\mathcal{L}f(s) = \int_0^\infty e^{-sx} f(x)\,dx$.
• $f * g$ stands for the convolution of $f$ and $g$: $f * g(x) = \int_0^x f(x - y)\,g(y)\,dy$. In particular, as h Moreover, $k_\theta$ is its density function:

The Laguerre Expansion of φ
Under the net profit condition (3), it is well known that the ruin probability φ given in (2) satisfies a defective renewal equation (DRE).
The case where $\theta = 0$ follows from Lemma 5, where the limit $\theta \to 0$ in Equation (9) is taken with $\theta > 0$.
Remark 1. Note that $\int_0^\infty g_\theta(x)\,dx < 1$ from (25) in the proof of Lemma 3, which means that the renewal-type Equation (9) is defective. This DRE is essential for constructing an estimator of $\phi$, as is seen below. The condition (1) is necessary to obtain the DRE, and we cannot include the case where $\int_0^1 z\,\nu(dz) = \infty$ in this statement; see Feng and Shimizu (2013), Lemma 3.1 and its remark.
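Let $L_k$ be the (normalized) Laguerre polynomial of order $k$, and let $\zeta_k$ be the corresponding Laguerre function on $\mathbb{R}_+$, given in (10). The functions $\{\zeta_k\}_{k\in\mathbb{N}_0}$ are known to form a complete orthonormal basis of $L^2(\mathbb{R}_+)$, so that $\phi$, $g_\theta$, and $h_\theta$ admit the Laguerre expansions (11). For each $K \in \mathbb{N}_0$, we denote by $p_K$, $q_K$, and $r_K$ the $(K+1)$-dimensional column vectors of the coefficients of these expansions.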
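The following sketch illustrates Laguerre functions and a partial sum of a Laguerre expansion numerically. It assumes the standard normalization $\zeta_k(x) = \sqrt{2}\,L_k(2x)\,e^{-x}$, under which the $\zeta_k$ are orthonormal in $L^2(\mathbb{R}_+)$; the function names, the quadrature settings, and the test function $f(x) = e^{-2x}$ are illustrative choices and do not come from the paper. For this $f$, the coefficients are $\langle f, \zeta_k\rangle = \sqrt{2}/3^{k+1}$, which follows from the Laplace transform $\mathcal{L}\zeta_k(s) = \sqrt{2}\,(s-1)^k/(s+1)^{k+1}$ evaluated at $s = 2$.

```python
import math

def laguerre_poly(k, x):
    """Laguerre polynomial L_k(x) via the three-term recurrence
    (j + 1) L_{j+1}(x) = (2j + 1 - x) L_j(x) - j L_{j-1}(x)."""
    prev, cur = 1.0, 1.0 - x
    if k == 0:
        return prev
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 - x) * cur - j * prev) / (j + 1)
    return cur

def zeta(k, x):
    """Laguerre function, assuming zeta_k(x) = sqrt(2) L_k(2x) exp(-x)."""
    return math.sqrt(2.0) * laguerre_poly(k, 2.0 * x) * math.exp(-x)

def inner_product(j, k, upper=60.0, n=60000):
    """Composite trapezoidal approximation of int_0^inf zeta_j zeta_k dx,
    truncated at `upper` (the integrand decays like exp(-2x))."""
    h = upper / n
    total = 0.5 * (zeta(j, 0.0) * zeta(k, 0.0)
                   + zeta(j, upper) * zeta(k, upper))
    for i in range(1, n):
        x = i * h
        total += zeta(j, x) * zeta(k, x)
    return total * h

def partial_sum(coeffs, x):
    """Evaluate sum_k coeffs[k] * zeta_k(x)."""
    return sum(p * zeta(k, x) for k, p in enumerate(coeffs))

# Worked example: partial sum of order K = 10 for f(x) = exp(-2x).
K = 10
coeffs = [math.sqrt(2.0) / 3 ** (k + 1) for k in range(K + 1)]
grid = [0.1 * i for i in range(101)]
max_err = max(abs(partial_sum(coeffs, x) - math.exp(-2.0 * x)) for x in grid)
```

Because the coefficients of this particular $f$ decay geometrically, a small $K$ already gives a uniformly small error on the grid; slower coefficient decay requires a larger $K$.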
By substituting the expression (11) into the defective renewal Equation (9), using a "convolution formula" for the $\zeta_k$'s, and comparing the coefficients of the $\zeta_k$'s, we have the following relations among $p_K$, $q_K$, and $r_K$.
Then, it holds for any $K \in \mathbb{N}_0$ that $A_K p_K = r_K$, where the matrix $A_K = (a_{ij})$ is formed from the coefficients in $q_K$. In particular, the matrix $A_K$ is invertible, and its elements $a_{ij}$ are uniformly bounded. Let $\boldsymbol{\zeta}_K(x) = (\zeta_0(x), \dots, \zeta_K(x))$ be a $(K + 1)$-dimensional row vector of the Laguerre functions. Then a "truncated version" of the Laguerre expansion of $\phi$, say, $\phi_K$, is defined as $\phi_K(x) = \boldsymbol{\zeta}_K(x)\,p_K$. For constants $r, B > 0$, denote by $W(\mathbb{R}_+, r, B)$ the Sobolev-Laguerre space. According to Zhang and Su (2017), if $\phi \in W(\mathbb{R}_+, r, B)$ for some $r, B > 0$, then the $L^2(\mathbb{R}_+)$-distance between $\phi$ and $\phi_K$ admits an explicit bound for each $K \in \mathbb{N}_0$. This implies that if $K$ is large enough, $\phi_K$ approximates $\phi$ with arbitrary accuracy in the sense of $L^2(\mathbb{R}_+)$. Under some regularity conditions on $\phi$, we can also show that $\phi_K$ converges to $\phi$ uniformly on $\mathbb{R}_+$ as $K \to \infty$. The following result gives a uniform convergence of the Laguerre expansion in the Sobolev-Laguerre space.
Proposition 3. Let $f \in W(\mathbb{R}_+, r, B)$ with $r > 1$, and let $f_K$ be the partial sum of the Laguerre expansion of $f$: Then, it follows that and applying the Cauchy-Schwarz inequality, we have
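The relation between the coefficient vectors can be sketched numerically as follows. The sketch assumes a renewal equation of the form $\phi = \phi * g + h$ and the convolution formula $\zeta_k * \zeta_j = (\zeta_{k+j} - \zeta_{k+j+1})/\sqrt{2}$ for $\zeta_k(x) = \sqrt{2}\,L_k(2x)\,e^{-x}$; the function name and the list-based layout are illustrative, and the matrix written here may be arranged differently from $A_K$ in Proposition 2. Comparing coefficients gives a lower triangular system that is solved by forward substitution.

```python
import math

def solve_laguerre_coefficients(q, r):
    """Solve for p_0..p_K in phi = phi * g + h, given the Laguerre
    coefficients q (of g) and r (of h), with len(q) >= len(r).

    Assumes zeta_k * zeta_j = (zeta_{k+j} - zeta_{k+j+1}) / sqrt(2).
    Comparing the coefficients of zeta_k gives, with q_{-1} := 0,
        p_k = r_k + sum_{j=0}^{k} p_j (q_{k-j} - q_{k-j-1}) / sqrt(2),
    which is lower triangular in p and solved by forward substitution.
    """
    s2 = math.sqrt(2.0)
    diag = 1.0 - q[0] / s2   # nonzero when g >= 0 integrates to less than 1
    p = []
    for k in range(len(r)):
        acc = r[k]
        for j in range(k):
            acc += p[j] * (q[k - j] - q[k - j - 1]) / s2
        p.append(acc / diag)
    return p

# Check in the classical case without diffusion: exponential claims with
# mean 1 and rho = lam / c, where g(x) = h(x) = rho * exp(-x) and
# phi(x) = rho * exp(-(1 - rho) x). Both g and h are multiples of zeta_0.
rho, K = 0.8, 8
q = [rho / math.sqrt(2.0)] + [0.0] * K
r = [rho / math.sqrt(2.0)] + [0.0] * K
p = solve_laguerre_coefficients(q, r)
# Coefficients of phi from the Laplace transform of zeta_k at s = 1 - rho.
p_exact = [rho * math.sqrt(2.0) * (-rho) ** k / (2.0 - rho) ** (k + 1)
           for k in range(K + 1)]
```

The diagonal entry is the same in every row, so the system is invertible as soon as that single quantity is nonzero, which is consistent with the invertibility statement for $A_K$ above.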

Coefficients Q k and R k
The coefficients $Q_k$ and $R_k$ $(k \in \mathbb{N}_0)$ can be represented as follows: where, for $\theta > 0$, and for $\theta = 0$,
Proof. For $\theta > 0$, by a standard application of the Fubini theorem, we have Similar calculations yield the results for $\theta = 0$.
The expressions of $H^Q_{k,0}$ and $H^R_{k,0}$ can also be obtained as the limit $\theta \downarrow 0$. From the assumptions in Proposition 4 and Lemma 5, it follows for each $z \in \mathbb{R}_+$ that Here, we provide some properties of $H^Q_k$ and $H^R_k$, which will be used later.
Lemma 1. Let $\Theta$ be a compact subset of $(0, \infty)$. Then, it follows for each $z \in \mathbb{R}_+$ that
Proof. Without loss of generality, we may suppose that Θ ⊂ Similarly, we also have
Lemma 2. Let $\Theta$ be a compact subset of $(0, \infty)$. Then, it follows for each $z \in \mathbb{R}_+$ and $\kappa \in \Theta$ that
Proof. As in the previous proof, we may suppose that Θ ⊂ Then, we have By the same argument as above, we also have This completes the proof.

Statistical Inference
Our goal is to estimate $\phi_K$ for a given $K \in \mathbb{N}$ from the observation $D_n$ as in (7) and to investigate the asymptotic behavior under the observation scheme (8). The strategy is to estimate the coefficients of the Laguerre series of $\phi$, which essentially consist of functionals of the Lévy measure $\nu$ as well as the diffusion coefficient $\theta$. For that purpose, we first introduce a few general tools, namely, some statistics and their limit theorems.
Let $\Theta$ be a parameter space for $\theta$, which is a compact subset of $\mathbb{R}$. Hereafter, we suppose that the true value of $\theta$, say, $\theta_0$, is positive ($\sigma > 0$) if we consider a diffusion perturbation model. Otherwise, we suppose that $\theta_0 = 0$ is known, and the treatment becomes much easier in this case.
We assume that there exists a known constant > 0, which is small enough such that and that θ 0 belongs to the interior of Θ: θ 0 ∈ int(Θ).We also put β 0 = c/θ 0 , and note that Hereafter, we always assume the asymptotics (8) as n → ∞:

Estimating the Lévy Characteristics
According to Proposition 4, we should estimate the functionals of the form ν(h).In this paper, we use semiparametric-type estimators for those functionals, proposed by Shimizu (2011).
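As a rough illustration of estimating a functional $\nu(h) = \int h(z)\,\nu(dz)$ from observed large jumps, the sketch below averages $h$ over the jumps exceeding a threshold and divides by the observation length. This empirical form, the names `nu_hat` and `simulate_cp_jumps`, and the compound Poisson test model are assumptions made for illustration; the precise estimator and its conditions are those of Shimizu (2011).

```python
import random

def nu_hat(jumps, h, T, eps):
    """Empirical functional: (1/T) * sum of h(z) over observed jumps z > eps."""
    return sum(h(z) for z in jumps if z > eps) / T

def simulate_cp_jumps(lam, mu, T, seed=0):
    """Jump sizes of a compound Poisson process on [0, T] with intensity lam
    and exponential jump sizes of mean mu (an illustrative choice)."""
    rng = random.Random(seed)
    jumps = []
    t = rng.expovariate(lam)            # first arrival time
    while t <= T:
        jumps.append(rng.expovariate(1.0 / mu))
        t += rng.expovariate(lam)       # next arrival time
    return jumps

T = 2000.0
jumps = simulate_cp_jumps(lam=12.0, mu=1.0, T=T)
# For h(z) = z the target is nu(h) = lam * mu = 12, up to the small
# contribution of jumps below the threshold eps.
estimate = nu_hat(jumps, lambda z: z, T, eps=2.0 / T)
```

The threshold plays the role of the cut-off below which jumps are not used; in a finite-activity model such as this one its effect is negligible, whereas for infinite-activity jumps it governs a bias that must vanish in the limit.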
Let $\mu$ be a jump-counting measure associated with the spectrally positive Lévy process $L = (L_t)_{t\ge 0}$: and put $\tilde\mu(dt, dz) = \mu(dt, dz) - \nu(dz)\,dt$, the compensated measure, such that the process For estimation of the functional $\nu(h)$ from the data $J_n(\varepsilon_n)$, Shimizu (2011) proposes the following estimator: θ ) : R + → R d , for each $\theta \in \Theta$. The following results are credited to Shimizu (2011).
Then, it follows for the matrix To estimate the diffusion coefficient $\theta = \sigma^2/2$, we use the results obtained by Jacod (2007); see also Shimizu (2011), Lemma 3.1 and Remark 3.2.
Proposition 7. Suppose that, for $\xi_n = \int_0^{\varepsilon_n} z^2\,\nu(dz)$, Then, the statistic $\hat\theta^t$ is a consistent estimator of $\theta_0$ for any constant $t \in [0, T_n]$ with a more rapid rate of convergence than $1/\sqrt{T_n}$:

Joint Convergence and Asymptotic Normality
Since $\hat\theta^t$ is consistent for $\theta_0$ for any fixed $t > 0$, we omit the superscript $t$ in the sequel: In practice, it would be better to take $t$ as large as possible so as to use a sample of sufficient size.
In view of the discussion in the last section, it is natural to estimate the $Q_k$'s and $R_k$'s, respectively, by where for each $\theta \in \Theta$.
Proposition 8. Consider the condition (15) and suppose that there exists some $\delta > 0$ such that Then, it follows for any $k \in \mathbb{N}_0$ that
Proof. Since $\hat\theta$ is consistent for the true value of $\theta$ under our assumption, it follows that Moreover, note that for any subsequence of $\hat\theta \to_P \theta_0$, there exists a further subsequence θ , as we see from Lemma 1 and the assumption that as $n \to \infty$ by the Lebesgue convergence theorem. Since these hold true for any subsequence of $\nu(H^Q_k(\cdot, \hat\theta))$, we also see that as $n \to \infty$.
To show the consistency of $(\hat Q_k, \hat R_k)$, we use Proposition 5. Thanks to Lemmas 1 and 2, we immediately see that the conditions in Proposition 5 hold true. Therefore, it follows for any > 0 that as $n \to \infty$, by Proposition 5, (17), and (18).
The consistency of $\hat R_k$ is proved similarly. In particular, the convergence of , and is therefore omitted. As for the convergence of Lζ k which is integrable and independent of the sample size $n$. Hence, it follows from the dominated convergence theorem that $\mathcal{L}\zeta_k(\hat\beta) \to_p \mathcal{L}\zeta_k(\beta_0)$. This completes the proof.
For each $K \in \mathbb{N}_0$, let
Proposition 9. Consider the conditions (15), (16), and suppose that Then, it holds for any $K \in \mathbb{N}_0$ that where
Proof. Without loss of generality, we can show the statement for $K = 0$, that is, We can then conclude from Proposition 5 that Similarly, we also see that (16). Hence $S^{(1)}_n \to_P 0$, which completes the proof.

Main Theorems
It follows from Proposition 2 that $p_K = A_K^{-1} r_K$. Since the matrix $A_K$ consists of $\{Q_i\}_{i=1,\dots,K}$, a natural estimator of $p_K$ is given by be an estimator of the function $\phi$. We then have a weakly consistent $\hat\phi_K$.
Theorem 1. Suppose the conditions (15) and (16). It then follows for each $K \in \mathbb{N}_0$ that
Proof. Note that , the uniform consistency of (21) holds as follows:
Theorem 2. Suppose the same assumptions as in Proposition 9. It then follows for any $K \in \mathbb{N}_0$ and $x > 0$ that with the lower triangular matrix $P^*_K$ given by , $(i, j = 0, 1, 2, \dots, K)$.
Proof. First, we shall show that $\hat p_K = \hat A_K^{-1} \hat r_K$ is asymptotically normal for each $K \in \mathbb{N}_0$. Noticing the equality that Note that, from Proposition 2, the $k$th component of the $(K + 1)$-dimensional vector for each $k \in \mathbb{N}$, where and we assume that $\sum_{j=1}^{0} \equiv 0$ as a convention. Consider the following $(K + 1) \times (K + 1)$ lower triangular matrix $\hat P^*_K$: where $P^*_K$ is the limit in probability, the existence of which is shown in the proof of Theorem 1, (23). Using this matrix $P^*_K$, we see that Therefore, Proposition 9 and Slutsky's lemma imply that This completes the proof.
Remark 3. We can construct a consistent estimator of the asymptotic variance of $\hat\phi_K(x)$ from the statistics $\hat p_K$, $\hat q_K$, and $\hat r_K$, although the representation is complicated. Therefore, the asymptotic normality result of Theorem 2 enables us to construct confidence intervals and hypothesis tests for $\phi_K(x)$. If $\phi \in W(\mathbb{R}_+, r, B)$ for $r > 1$ and $B > 0$, then, for large enough $K$, $\phi_K$ is uniformly close to the true $\phi$ on $\mathbb{R}_+$. Therefore, the confidence interval for $\phi_K$ serves as an approximate confidence interval for $\phi$.
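Given a point estimate, a consistent estimate of the asymptotic variance, and the observation length $T_n$, the interval described in Remark 3 is the usual Wald interval. The sketch below shows the arithmetic only; the function name and the numerical inputs are hypothetical, and the variance estimate is taken as given rather than computed from data.

```python
import math
from statistics import NormalDist

def wald_interval(estimate, sigma2_hat, T, level=0.95):
    """Two-sided interval estimate +/- z * sqrt(sigma2_hat / T), based on
    sqrt(T) * (estimate - target) being approximately N(0, sigma2_hat)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # standard normal quantile
    half_width = z * math.sqrt(sigma2_hat / T)
    return estimate - half_width, estimate + half_width

# Hypothetical inputs: estimate 0.3, variance estimate 0.04, T_n = 100.
lower, upper = wald_interval(0.3, 0.04, 100.0)
```

Since the target is a probability, the endpoints may be clipped to $[0, 1]$ in practice; a test of $H_0: \phi_K(x) = \phi_0$ at the matching level rejects when $\phi_0$ falls outside the interval.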

Simulations
We present some numerical examples to confirm the asymptotic normality of our proposed estimators. We consider the following two models, for finite and infinite activity jumps:
(CP) Compound Poisson model: for $c = 15$ where $N$ is a Poisson process with intensity $\lambda = 12$, and the $U_i$'s are IID random variables with an exponential distribution with mean $\mu = 1$; the Lévy density is $\nu(x) = \lambda\mu^{-1}e^{-x/\mu}$ $(x > 0)$, and we set $\sigma = 1$. In the simulation, we suppose that $\lambda$, $\mu$, and $\sigma$ are unknown. In this case, the ruin probability is explicitly known as
(GS) Gamma subordinator model: for $c = 1$ where $L$ is a gamma process with the Lévy density $\nu(x) = x^{-1}e^{-\gamma x}$ $(x > 0)$ with $\gamma = 20$, and we set $\sigma = 1$. In the simulation, we suppose that $\sigma$ and $\gamma$ are unknown. In this model, the ruin probability is not explicit, but we can compute it numerically, e.g., via the fast Fourier transform; see, e.g., Zhang and Yang (2013).
To observe the asymptotic normality of the proposed estimators of $\phi_K(x)$, we show QQ-plots for $\hat\phi_K(x)$ with $K = 10$ and $x = 1, 3, 5$ based on 300 replications under the sampling setting (7) with $\Delta_n = 1/2T_n$ and $\varepsilon_n = 2/T_n$, and compare the results across the values of the initial reserve $x = 1, 3, 5$ and $T_n = 120, 360$. The results are given in Figures 1-3 for Model (CP), and Figures 4-6 for Model (GS).
Most of the results exhibit asymptotic normality as the value of $T_n$ becomes large. For the case of $\hat\phi_K(5)$ with $T_n = 360$ in Figures 3 and 6, the right tails still do not seem to converge to the normal distribution. Although we cannot fully explain this phenomenon, it might be due to the selected value of $\varepsilon_n$, which significantly affects the estimation of the parameters. How to choose $\varepsilon_n$ in practice is an important problem that merits further study, but it is beyond the scope of this paper.
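For readers who wish to reproduce the flavor of Model (GS), the sketch below simulates increments of the surplus on a regular grid, assuming the form $X_t = ct + \sigma W_t - L_t$. It uses the fact that a gamma process with Lévy density $x^{-1}e^{-\gamma x}$ has increments over a step $\Delta$ that are gamma distributed with shape $\Delta$ and rate $\gamma$. The function name and the grid values are illustrative and are not the settings used for the figures.

```python
import math
import random

def simulate_gs_increments(c, sigma, gamma, delta, n, seed=0):
    """Increments of X_t = c t + sigma W_t - L_t on a grid of mesh delta,
    where L is a gamma process with Levy density x^{-1} exp(-gamma x).

    Over a step of length delta, L increases by a Gamma(shape=delta,
    rate=gamma) variable; random.gammavariate takes (shape, scale)."""
    rng = random.Random(seed)
    jump_part = [rng.gammavariate(delta, 1.0 / gamma) for _ in range(n)]
    increments = [c * delta + sigma * math.sqrt(delta) * rng.gauss(0.0, 1.0) - j
                  for j in jump_part]
    return increments, jump_part

delta, n = 0.01, 20000
increments, jump_part = simulate_gs_increments(
    c=1.0, sigma=1.0, gamma=20.0, delta=delta, n=n)
# Average claim outflow per unit time; its expectation is 1 / gamma = 0.05,
# which is below c = 1, so the net profit condition holds.
mean_outflow = sum(jump_part) / (n * delta)
```

Note that this produces only the aggregated jump part per grid cell; extracting the individual jumps above a threshold, as the sampling scheme requires, would need a different simulation method such as a series representation.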

Concluding Remarks
In this paper, we consider statistical inference for the ruin probability of a Lévy insurance surplus under a certain sampling scheme. The samples consist of a mixture of $n$ discrete observations of the surplus, which are assumed to be a book record of the (e.g., daily) surplus, and 'large' jumps, which are insurance claims larger than a certain threshold $\varepsilon_n > 0$.
We consider the Laguerre expansion of the ruin probability, which is the series expansion based on the Laguerre functions in (10). The coefficients are obtained in an explicit form that includes unknown quantities: the diffusion coefficient $D = \sigma^2/2$ and functionals of the Lévy measure of the form $\nu(H) = \int_{\mathbb{R}} H(z)\,\nu(dz)$. Those unknowns are estimable from our sampling data, and we showed the asymptotic properties of those estimators, which lead to the asymptotic normality of the estimated partial sum of the Laguerre expansion of the ruin probability as $n \to \infty$ and $\varepsilon_n \to 0$. The asymptotic distribution enables us to construct asymptotic confidence intervals for the ruin probability, which is important for applying ruin theory in practice.
In this paper, we assumed that $\varepsilon_n \to 0$ and that we can observe all the jumps that are larger than $\varepsilon_n$, which means that we can observe all of the infinitely many jumps in the limit. Of course, such a situation is not realistic, but this paper investigates the rate of convergence and the possibility of asymptotic normality of the estimators under a kind of ideal situation. We clarified the speed at which $\varepsilon_n$ goes to zero, as in Proposition 9, which should be the first step to be specified in the theory of statistical inference. Note that the rate condition on $\varepsilon_n$ is only for the theory and is not checkable in practice, as is always the case in asymptotic statistics. In the simulation, we use $\varepsilon_n = 2/T_n$ as an example that satisfies the asymptotic conditions in Proposition 9. However, in practice, the value of $\varepsilon_n$ is determined naturally, e.g., by the value of the deductible if it exists, or by the smallest jump size within the observations. The asymptotics $\varepsilon_n \to 0$ is a kind of approximation of the real situation: the theory ensures the statistical validity of our estimators if the value of $\varepsilon_n$ is practically 'small' enough and if the observed surplus is a realization of a Lévy process of the kind assumed here. In this context, we may need "a new aspect" for the surplus model as described in Shimizu (2009).
Proof. Note that, from (1), it follows that Moreover, note that $\pi_d(x) = \pi_\infty^{-1}\,\nu(x)$ is a probability density function, and $g_\theta(x)\,\pi_\infty^{-1} c = k_\theta * \pi_d(x)$ is the probability density. In particular, we see that $g_\theta$ is the density of a defective distribution since by (3). Therefore, Note that since the last term is a probability tail function. Hence, $\Lambda(x) \le \pi_\infty$, which yields sup As a consequence, This completes the proof.
Proof. From Proposition 1, we have the following Laplace transform of $\phi$: The last term of the right-hand side can be evaluated at $s = 0$ since $g_\theta, h_\theta \in L^1(\mathbb{R}_+)$ by Lemma 3. Therefore, we have Hence, $\phi \in L^1(\mathbb{R}_+)$. Considering that $0 \le \phi(x) \le 1$ is the probability of ruin, it follows for any $p \ge 1$ that
Proof. Since $f$ is a polynomial growth function, we see, using integration by parts, that