Estimating Ruin Probability in an Insurance Risk Model with Stochastic Premium Income Based on the CFS Method

Abstract: This paper considers the estimation of ruin probability in an insurance risk model with stochastic premium income. We first show that the ruin probability can be approximated by the complex Fourier series (CFS) expansion method. Then, we construct a nonparametric estimator of the ruin probability and analyze its convergence. Numerical examples are also provided to show the efficiency of our method when the sample size is finite.


Introduction
In the classical insurance risk model, the premium rate is constant and the premium income is a linear function of time. This assumption does not match the actual operation of an insurance company, so it is natural to extend the classical risk model by replacing the constant premium income with a compound Poisson process. This paper therefore considers a risk model in which the premium income is no longer a linear function of time, but a stochastic process represented by a random sum, that is,
$$U(t) = u + \sum_{i=1}^{M_t} Y_i - \sum_{j=1}^{N_t} X_j, \quad t \ge 0,$$
where $u \ge 0$ is the initial surplus, the premium counting process $\{M_t\}$ is a homogeneous Poisson process with parameter $\mu > 0$, and the claim counting process $\{N_t\}$ is a homogeneous Poisson process with parameter $\lambda > 0$. The individual claim amounts $X_1, X_2, \ldots$ are continuous, independent random variables distributed as a generic random variable $X$, with distribution function $F_X$, density function $f_X$ and mean $\mu_X$. Similarly, the individual premium amounts $Y_1, Y_2, \ldots$ are continuous, independent random variables distributed as a generic random variable $Y$, with distribution function $F_Y$, density function $f_Y$ and mean $\mu_Y$. We assume that $\{M_t\}$, $\{N_t\}$, $\{X_j\}$ and $\{Y_i\}$ are independent of each other, and that a single premium follows the exponential distribution with parameter $\beta$, that is, its density function is $g(y) = \beta e^{-\beta y}$, $y > 0$, where $\beta$ is unknown. We also need the safety loading condition $\mu \mu_Y > \lambda \mu_X$.
Although an explicit expression for the ruin probability $\psi(u)$ can be obtained under certain assumptions, in more general situations it is usually not easy to obtain, and approximations can only provide rough information about ruin. In recent years, statistical estimation and inference have become an important means of dealing with the ruin probability.
Many recent studies use parametric and nonparametric statistical methods to study the ruin probability. The sample data they rely on, such as the number of claims and the sizes of individual claims, can be observed in the actual operation of insurance companies, which makes this line of research feasible and practically significant. In this context, we propose a nonparametric method for estimating the ruin probability from discrete observation data, aiming at good estimation properties, a fast convergence rate and general applicability. General applicability has two aspects. On the one hand, the method should apply under different claim distribution assumptions. On the other hand, it should extend to further risk models. A method that meets these goals can provide theoretical guidance for different operating conditions, as well as directions and references for further research.
In this paper, we use the CFS expansion method to estimate the ruin probability in an insurance risk model with stochastic premium income. In most calculations, working with the complex form keeps the problem simple and the formulas compact, and the result only needs to be converted to a real Fourier series at the end. This paper applies the method to ruin theory, with the hope of extending and enriching the results in this field. The remainder of this paper is organized as follows. In Section 2, we introduce the CFS expansion method. In Section 3, we show how to approximate the ruin probability by the CFS expansion method. In Section 4, we give a nonparametric estimator of the ruin probability. In Section 5, we study the approximation error caused by the CFS approximation and the statistical error of the estimator. In Section 6, simulation results are given to show the effectiveness of our estimator. Finally, conclusions are given in Section 7.
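To make the model concrete, the following minimal Python sketch simulates the data an insurer would observe on $[0, T]$ and evaluates the surplus. It assumes NumPy, exponential claims with mean 1 (only one of the possible claim distributions), and parameter values chosen for illustration; the function names `simulate_observations` and `surplus` are hypothetical.

```python
import numpy as np

def simulate_observations(T, lam=2.0, mu=5.0, beta=1.0, claim_mean=1.0, seed=0):
    """Simulate claim and premium data of the risk model on [0, T]."""
    rng = np.random.default_rng(seed)
    n_claims = rng.poisson(lam * T)      # N_T, number of claims
    n_premiums = rng.poisson(mu * T)     # M_T, number of premium payments
    # Given the counts, the event times of a homogeneous Poisson process
    # are i.i.d. uniform on [0, T].
    claim_times = np.sort(rng.uniform(0.0, T, n_claims))
    premium_times = np.sort(rng.uniform(0.0, T, n_premiums))
    claims = rng.exponential(claim_mean, n_claims)       # X_j (assumed exponential)
    premiums = rng.exponential(1.0 / beta, n_premiums)   # Y_i ~ Exp(beta)
    return claim_times, claims, premium_times, premiums

def surplus(t, u, claim_times, claims, premium_times, premiums):
    """Surplus at time t: initial surplus plus premiums minus claims up to t."""
    income = premiums[premium_times <= t].sum()
    outgo = claims[claim_times <= t].sum()
    return u + income - outgo

ct, c, pt, p = simulate_observations(T=120.0)
print(len(c), len(p), surplus(120.0, 10.0, ct, c, pt, p))
```

With these illustrative parameters the expected premium income per unit time, $\mu/\beta = 5$, exceeds the expected claim outgo, $\lambda \mu_X = 2$, so the safety loading condition holds.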

CFS Expansion Method
In this section, we give a basic introduction to the CFS expansion method. Let the function $f(x)$ be defined on $[-\pi, \pi]$ and satisfy the following Dirichlet conditions: (1) $f(x)$ has a finite number of discontinuities on $[-\pi, \pi]$; (2) $f(x)$ has a finite number of extreme values on $[-\pi, \pi]$. Then $f(x)$ has the CFS expansion
$$f(x) = \sum_{k=-\infty}^{\infty} A_k e^{ikx},$$
where the Fourier coefficients are given by
$$A_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx.$$
More generally, if $f(x)$ is defined on $[a, b]$ and satisfies the corresponding Dirichlet conditions, a simple change of variable gives
$$f(x) = \sum_{k=-\infty}^{\infty} A_k e^{i\frac{2\pi k}{b-a}x}, \quad \text{where} \quad A_k = \frac{1}{b-a} \int_a^b f(x) e^{-i\frac{2\pi k}{b-a}x}\,dx.$$
In practice, by introducing an appropriate truncation integer $K$, we obtain an approximation of $f(x)$, namely
$$f(x) \approx \sum_{k=-K}^{K} A_k e^{i\frac{2\pi k}{b-a}x}.$$
Since $f(x)$ is real-valued, $A_{-k} = \overline{A_k}$, and taking the real part of both sides gives
$$f(x) \approx 2\,\mathrm{Re}\Big\{ {\sum_{k=0}^{K}}' A_k e^{i\frac{2\pi k}{b-a}x} \Big\}, \qquad (3)$$
where $\mathrm{Re}\{\cdot\}$ denotes the real part and $\sum'$ denotes a sum in which the first term is weighted by 1/2.
Equation (3) gives an approximation of the real-valued function that satisfies the Dirichlet condition. In the next section, we will use this conclusion to approximate and estimate the probability of ruin.
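Equation (3) can be tried out numerically. The following is a minimal Python sketch, assuming NumPy; the function names `cfs_coefficients` and `cfs_approx`, the midpoint quadrature used for the coefficients, and the test function $f(x) = x^2$ on $[-1, 1]$ are illustrative choices and are not taken from the paper.

```python
import numpy as np

def cfs_coefficients(f, a, b, K, n_quad=4096):
    """Coefficients A_k = 1/(b-a) * int_a^b f(x) exp(-i 2 pi k x/(b-a)) dx
    for k = 0..K, computed with the midpoint rule."""
    x = a + (np.arange(n_quad) + 0.5) * (b - a) / n_quad
    k = np.arange(K + 1)
    basis = np.exp(-2j * np.pi * np.outer(k, x) / (b - a))  # shape (K+1, n_quad)
    return (basis * f(x)).mean(axis=1)

def cfs_approx(x, coeffs, a, b):
    """Truncated real form: 2 Re{ sum' A_k exp(i 2 pi k x/(b-a)) },
    where the k = 0 term carries weight 1/2."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    k = np.arange(len(coeffs))
    weights = np.ones(len(coeffs))
    weights[0] = 0.5
    phases = np.exp(2j * np.pi * np.outer(x, k) / (b - a))  # shape (len(x), K+1)
    return 2.0 * np.real(phases @ (weights * coeffs))

coeffs = cfs_coefficients(lambda x: x**2, -1.0, 1.0, K=64)
print(cfs_approx([0.0, 0.3], coeffs, -1.0, 1.0))  # close to [0.0, 0.09]
```

Only the coefficients with $k \ge 0$ are needed, because for a real-valued function the negative-index coefficients are their complex conjugates.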

Approximate Ruin Probability by CFS Expansion Method
In this section, we will introduce how to apply the CFS method to approximate the probability of ruin.
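First, we introduce the even continuation of $\psi$, namely $\psi_e(u) = \psi(u)$ for $u \ge 0$ and $\psi_e(u) = \psi(-u)$ for $u < 0$. We also define
$$\mathcal{F}\psi(s) = \int_0^{\infty} e^{isu} \psi(u)\,du, \qquad \mathcal{F}\psi_e(s) = \int_{-\infty}^{\infty} e^{isu} \psi_e(u)\,du,$$
which are the Fourier transforms of $\psi(u)$ and $\psi_e(u)$, respectively. On this basis, we have
$$\mathcal{F}\psi_e(s) = \mathcal{F}\psi(s) + \overline{\mathcal{F}\psi(s)}, \qquad (6)$$
where $\overline{\mathcal{F}\psi(s)}$ denotes the complex conjugate of $\mathcal{F}\psi(s)$.
Next, we use the CFS method to approximate the ruin probability under the risk model with stochastic premium income. We define the function $\omega(u) = \int_u^{\infty} f_X(x)\,dx$, whose Fourier transform and Laplace transform are, respectively,
$$\mathcal{F}\omega(s) = \int_0^{\infty} e^{isu} \omega(u)\,du, \qquad \mathcal{L}\omega(s) = \int_0^{\infty} e^{-su} \omega(u)\,du.$$
On the basis of these definitions, we first give the Fourier expression of $\psi_e(u)$. The Fourier transform of $\psi(u)$, Equation (7), follows from Equation (17) of Wang et al. [13]. Substituting it into Equation (6) gives the Fourier transform of $\psi_e(u)$, Equation (8).
Mathematics 2021, 9, 982
Since the extended ruin probability $\psi_e(u)$ only needs to be considered as a function on $[-a, a]$, its CFS approximation is obtained from Equation (3), with coefficients $B_k$ determined by $\mathcal{F}\psi_e$ and with an indicator function $I$ restricting the argument. Finally, this yields the approximate expression $\psi_K(u)$ of the ruin probability, Equations (10) and (11).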
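The mechanics of this step can be illustrated with a short Python sketch. It rests on assumptions that are stated here and are not quoted from the paper: $\psi_e$ is taken to be the even extension of $\psi$, it is taken to be negligible outside $[-a, a]$ so that the CFS coefficients on $[-a, a]$ are approximately $\frac{1}{2a}\mathcal{F}\psi_e(k\pi/a)$, and the test function $\psi(u) = c e^{-ru}$ with $c = r = 0.5$ is purely illustrative. Under these assumptions Equation (3) reduces to a cosine sum. The function name `ruin_from_transform` is hypothetical.

```python
import numpy as np

def ruin_from_transform(u, ft_even, a, K):
    """Cosine-sum form of Equation (3) on [-a, a] for an even function:
    psi(u) ~ (1/a) * sum'_{k=0..K} F psi_e(k pi/a) * cos(k pi u/a),
    where the k = 0 term carries weight 1/2."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    k = np.arange(K + 1)
    coef = np.real(ft_even(k * np.pi / a)) / a
    coef[0] *= 0.5
    return np.cos(np.outer(u, k) * np.pi / a) @ coef

# Illustrative test function psi(u) = c * exp(-r u), u >= 0.
# Its transform is F psi(s) = c / (r - i s), so that
# F psi_e(s) = 2 Re F psi(s) = 2 c r / (r^2 + s^2).
c, r = 0.5, 0.5
ft_even = lambda s: 2.0 * c * r / (r**2 + s**2)

u = np.array([1.0, 5.0, 10.0])
print(ruin_from_transform(u, ft_even, a=30.0, K=1024))
print(c * np.exp(-r * u))
```

The two printed rows should agree closely. The remaining gap comes from truncating the series at $K$, since the coefficients only decay like $k^{-2}$.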

Nonparametric Estimation of Ruin Probability
In this section, we will give a nonparametric estimation of ruin probability based on the approximate results in the previous section.
It is assumed that the insurance company can obtain the following data sets:
(1) A data set composed of the number of claims and the claim amounts, $\{N_T, X_1, X_2, \ldots, X_{N_T}\}$, where $N_T$ is the total number of claims in the observation interval $[0, T]$.
(2) A data set composed of the number of random premium incomes and the premium amounts, $\{M_T, Y_1, Y_2, \ldots, Y_{M_T}\}$, where $M_T$ is the total number of random premium incomes in the observation interval $[0, T]$.
On this basis, we estimate the ruin probability. From Equations (7), (8), (10) and (11), we need to estimate the following quantities in order to construct the estimator of the ruin probability: $\lambda$, $\mu$, $\beta$, $\mu_X$, $\mathcal{F}\omega(s)$ and $\mathcal{F}f_X(s)$. For $\lambda$ and $\mu$, we use the estimators
$$\hat{\lambda} = \frac{N_T}{T}, \qquad \hat{\mu} = \frac{M_T}{T}.$$
Since the premium income $Y$ follows the exponential distribution with parameter $\beta$, we have $E[Y] = 1/\beta$, and the moment estimator of the expectation gives the estimator of $\beta$:
$$\hat{\beta} = \frac{M_T}{\sum_{i=1}^{M_T} Y_i}.$$
Similarly, we use the moment estimator of $\mu_X$:
$$\hat{\mu}_X = \frac{1}{N_T} \sum_{j=1}^{N_T} X_j.$$
For $\mathcal{F}\omega(s)$ and $\mathcal{F}f_X(s)$, the empirical characteristic function gives the estimators
$$\widehat{\mathcal{F}\omega}(s) = \frac{1}{N_T} \sum_{j=1}^{N_T} \frac{e^{isX_j} - 1}{is}, \qquad \widehat{\mathcal{F}f_X}(s) = \frac{1}{N_T} \sum_{j=1}^{N_T} e^{isX_j}.$$
On this basis, according to Equations (7) and (8), we obtain the estimator $\widehat{\mathcal{F}\psi_e}$ of the Fourier transform $\mathcal{F}\psi_e$, which is expressed through an estimator $\widehat{\mathcal{F}H}(s)$. Furthermore, combining with Equations (10) and (11), we obtain the estimator $\hat{\psi}$ of the ruin probability, with estimated coefficients $\hat{B}_k$. The resulting estimator is as concise as those of other models, which is a highlight of this estimation method.
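The plug-in quantities of this section are straightforward to compute from data. The Python sketch below assumes NumPy, the rate estimators $N_T/T$ and $M_T/T$, the moment estimators of $\beta$ and $\mu_X$, and empirical transforms with kernels $e^{isx}$ and $(e^{isx}-1)/(is)$; the function names `point_estimates` and `empirical_transforms` and the toy data are hypothetical. It stops short of the final ruin probability estimator, whose exact formula depends on Equations (7) to (11).

```python
import numpy as np

def point_estimates(claims, premiums, T):
    """Moment-type estimators of lambda, mu, beta and mu_X from data on [0, T]."""
    claims = np.asarray(claims, dtype=float)
    premiums = np.asarray(premiums, dtype=float)
    return {
        "lambda": len(claims) / T,              # N_T / T
        "mu": len(premiums) / T,                # M_T / T
        "beta": len(premiums) / premiums.sum(), # 1 / sample mean of Y
        "mu_X": claims.mean(),                  # sample mean of X
    }

def empirical_transforms(claims, s):
    """Empirical versions of F f_X(s) = E[exp(i s X)] and
    F omega(s) = E[(exp(i s X) - 1) / (i s)], evaluated on a grid s."""
    claims = np.asarray(claims, dtype=float)
    s = np.atleast_1d(np.asarray(s, dtype=float))
    phase = np.exp(1j * np.outer(s, claims))       # shape (len(s), n)
    f_hat = phase.mean(axis=1)
    zero = (s == 0.0)[:, None]
    safe_s = np.where(zero, 1.0, s[:, None])       # avoid dividing by zero
    # At s = 0 the kernel (exp(i s x) - 1)/(i s) has the limit x.
    kernel = np.where(zero, claims[None, :], (phase - 1.0) / (1j * safe_s))
    omega_hat = kernel.mean(axis=1)
    return f_hat, omega_hat

est = point_estimates(claims=[1.0, 2.0, 3.0], premiums=[0.5, 1.5], T=10.0)
f_hat, omega_hat = empirical_transforms([1.0, 2.0, 3.0], s=[0.0, 0.5, 2.0])
print(est, f_hat, omega_hat)
```

The two transforms are linked by $is\,\widehat{\mathcal{F}\omega}(s) = \widehat{\mathcal{F}f_X}(s) - 1$ for $s \ne 0$, which is a convenient check on an implementation.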

Convergence Analysis of Estimation Methods
In this section, we analyze the error of the method proposed in the previous section. First, we study the approximation error caused by the CFS approximation of the ruin probability. The following proposition gives the corresponding conclusion.
Proposition 1. Suppose $\psi(u)$ is twice continuously differentiable, $a = O(K)$ and $|is\,\mathcal{F}f_X(s)| < C$. Then, for sufficiently large $K$, the error of the CFS approximation of $\psi(u)$ admits an upper bound in which $c_0$ is a constant defined in the proof and $C$ is also a constant.
Proof. From Equation (3.6) of Li et al. [24], we obtain Equation (15). To bound the right side of Equation (15) further, we first calculate the residual sum $\sum_{k=K+1}^{\infty} B_k$, for which we need to study the Fourier transform $\mathcal{F}\psi_e(s)$. Substituting Equation (7) into Equation (8) gives Equation (16). Simplifying Equation (16) and bounding its denominator for $|s| > 2\beta\lambda/\mu$, we find that $|\mathcal{F}\psi_e(s)| = O(|s|^{-2})$ as $|s| \to \infty$. In other words, there is a constant $c_0 > 0$ such that, for sufficiently large $|s|$,
$$|\mathcal{F}\psi_e(s)| \le c_0 |s|^{-2}.$$
The coefficient $B_k$ can then be bounded accordingly, and when $a = O(K)$ we immediately obtain Equation (17). For the first term on the right side of the inequality in Equation (15), Equation (3.14) of Li et al. [24] already gives the corresponding bound, Equation (18). Combining Equations (15), (17) and (18) completes the proof.

Note:
The assumption $|is\,\mathcal{F}f_X(s)| < C$ is reasonable for commonly used distributions, such as:
(1) When $X$ follows an exponential distribution with parameter $\alpha$ ($\alpha > 0$), we have $\mathcal{F}f_X(s) = \frac{\alpha}{\alpha - is}$, so that $|is\,\mathcal{F}f_X(s)| = \frac{\alpha |s|}{\sqrt{\alpha^2 + s^2}} < \alpha$.
(2) When $X$ follows the Erlang distribution with parameter $(n, \alpha)$ ($n \in \mathbb{N}$, $\alpha > 0$), we have $\mathcal{F}f_X(s) = \left(\frac{\alpha}{\alpha - is}\right)^n$, so that $|is\,\mathcal{F}f_X(s)| = \frac{|s|\,\alpha^n}{(\alpha^2 + s^2)^{n/2}} \le \alpha$.
Next, we study statistical errors. Two theorems are proposed here, and their conclusions will help us obtain the final result.
Proof. From the expressions of $\widehat{\mathcal{F}H}(s)$ and $\mathcal{F}H(s)$, the estimation error is controlled by two terms, $L_1$ and $L_2$, as in Equation (19). For $L_1$, let $g_{1,s}(x) = \int_0^x e^{isu}\,du = \frac{e^{isx} - 1}{is}$. Then $L_1$ can be expressed as in Equation (20). The second term on the right side of Equation (20) is bounded as in Equation (21). In addition, by Lemma 3.1 of Li et al. [24], the first term on the right side of Equation (20) can be bounded as well. Combining this with Equation (21), we obtain Equation (23). Next consider $L_2$. Substituting the specific expressions of $\widehat{\mathcal{F}\omega}(s)$, $\hat{\mu}_X$ and $\mathcal{F}\omega(s)$ into $L_2$ and letting $g_{2,s}(x) = \frac{\int_0^x e^{isu}\,du - x}{is}$, we obtain Equation (24). The second term on the right side of the last line of Equation (24) is bounded directly, which gives Equation (25). For the first term on the right side of the last line of Equation (24), following the proof ideas in the appendix of Zhang [25], we obtain Equation (26). Combining Equations (25) and (26), we get Equation (27). Finally, the theorem follows from Equations (19), (23) and (27).
Theorem 1 shows that the estimation error of $\widehat{\mathcal{F}H}(s)$ has logarithmic convergence. To obtain the convergence rate of the estimation error of $\psi_K(u)$, we next need to analyze $\mathcal{F}G(s)$, which is the subject of the following theorem.

Proof
For $\Pi_1$, let $g_{3,s}(x) = e^{isx}$. Then $\Pi_1$ can be decomposed as in Equation (29). The second term on the right side of the last line of Equation (29) is bounded uniformly over $s \in K$, which gives Equation (30). For the first term on the right side of the last line of Equation (29), we introduce two real-valued function classes: $\mathcal{G}_{3,K,R} = \{g : g = \mathrm{Re}(g_{3,s}), s \in K\}$ and $\mathcal{G}_{3,K,I} = \{g : g = \mathrm{Im}(g_{3,s}), s \in K\}$.
Following the proof of Wang et al. [13], we obtain Equation (31). Combining Equations (29), (30) and (31), an upper bound for $\Pi_1$ is obtained, Equation (32). Next consider $\Pi_2$. Let $g_{4,s}(x) = \frac{e^{isx} - 1}{is}$. Then $\Pi_2$ can be decomposed as in Equation (33), and one term on its right side is bounded directly, which gives Equation (34). For the first term on the right side of the last line of Equation (33), Lemma 3.1 of Li et al. [24] gives Equation (35). Combining Equations (34) and (35), we get Equation (36). Finally, combining Equations (28), (32) and (36), the theorem is proved.
On the basis of the above two theorems, we state the final conclusion on the estimation error of the ruin probability.

Proof
A simple bound can be drawn directly from the assumptions. Incorporating it into Equation (37) and combining the result with Equation (3.15) of Li et al. [24], we finally obtain the stated conclusion.

Numerical Simulation
In this section, we provide simulation results to show the performance of our estimation method when the sample size is finite. We set the parameters $\lambda = 2$, $\mu = 5$, $\beta = 1$ and $K = 1024$. In addition, we let $a = 30$, that is, $0 \le u \le 30$, because the ruin probability is very close to 0 when the initial surplus $u > 30$. We consider the following three claim density functions: (1) Exponential density function: $f_X(x) = e^{-x}$, $x > 0$.
Note that all three claim density functions satisfy $\mu_X = 1$. Through the Laplace transform method, the true value of the ruin probability under the first two claim distributions can be calculated directly. For the third density, the Gamma(1.5, 1.5) density function, we compare the CFS approximation with the estimated results. We consider the cases $T = 120, 180, 360$ and calculate the integrated mean square error (IMSE), the mean value and the average relative error based on 300 independent repeated experiments.
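For the exponential claim case, the true ruin probability can be cross-checked by simulation. The Python sketch below is an independent sanity check under stated assumptions, not the paper's procedure: it simulates the embedded random walk of the surplus (each event is a premium with probability $\mu/(\lambda+\mu)$ and a claim otherwise), treats a path as safe once it exceeds $u$ plus a cap, and compares the result with the closed form $\psi(u) = C e^{-Ru}$, where $R = (\mu\alpha - \lambda\beta)/(\lambda+\mu)$ and $C = \lambda(\alpha+\beta)/(\alpha(\lambda+\mu))$. This closed form is obtained here by substituting an exponential trial solution into the first-step equation for exponential claims with parameter $\alpha$; it is not quoted from the text. All function names are hypothetical.

```python
import numpy as np

def exp_claim_ruin(u, lam=2.0, mu=5.0, alpha=1.0, beta=1.0):
    """Closed form C * exp(-R u) for Exp(alpha) claims and Exp(beta) premiums
    (derived for this sketch, see the lead-in)."""
    R = (mu * alpha - lam * beta) / (lam + mu)
    C = lam * (alpha + beta) / (alpha * (lam + mu))
    return C * np.exp(-R * u)

def mc_ruin_probability(u, lam=2.0, mu=5.0, beta=1.0, claim_mean=1.0,
                        n_paths=20000, cap=40.0, max_events=5000, seed=0):
    """Monte Carlo estimate of the ruin probability via the embedded walk."""
    rng = np.random.default_rng(seed)
    surplus = np.full(n_paths, float(u))
    ruined = np.zeros(n_paths, dtype=bool)
    active = np.ones(n_paths, dtype=bool)
    p_premium = mu / (lam + mu)          # probability the next event is a premium
    for _ in range(max_events):
        idx = np.flatnonzero(active)
        if idx.size == 0:
            break
        is_premium = rng.random(idx.size) < p_premium
        premiums = rng.exponential(1.0 / beta, idx.size)
        claims = rng.exponential(claim_mean, idx.size)
        surplus[idx] += np.where(is_premium, premiums, -claims)
        ruined[idx] = surplus[idx] < 0.0
        # Stop following a path once it is ruined or far above its start.
        active[idx] = (surplus[idx] >= 0.0) & (surplus[idx] < u + cap)
    return ruined.mean()

for u0 in (0.0, 2.0, 5.0):
    print(u0, mc_ruin_probability(u0), exp_claim_ruin(u0))
```

With $\lambda = 2$, $\mu = 5$ and $\alpha = \beta = 1$, the closed form is $\frac{4}{7} e^{-3u/7}$, and the simulated values should agree with it up to Monte Carlo error.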
First of all, Figures 1-3 show the results of 25 independent repeated experiments under the three claim distributions. The estimated curves are compared with the true value curve to illustrate the stability and accuracy of the procedure, and to show the variation bands of the estimates for different observation intervals. The figures clearly show that the estimates become more stable as the observation time T increases.
Next, based on the same 300 repeated experiments, Figure 4 shows the average estimate of the ruin probability under the three claim distribution assumptions for different T and compares it with the true curve. As T increases, the average estimate gets closer to the true value and eventually becomes difficult to distinguish from it.
To further illustrate the accuracy of the proposed method, we show the average relative error curves under the three claim distribution assumptions in Figure 5. Once again, we find that the average relative error decreases as the observation time T increases, which means that our estimation method becomes more accurate for larger T.
Figure 5c differs from the relative error curves of the other two distributions. The main reason is that the reference value used in this paper under this distribution is our CFS approximation. A more rigorous way to calculate the true value needs further study. From this figure, we also find that the average relative error fluctuates greatly when the initial surplus is in the range of 15-25. The corresponding average estimate curve is shown separately in Figure 6. In this interval, the estimated values begin to fluctuate and deviate slightly from the reference values, which accounts for the behavior of the relative error curve.
Finally, based on the above 300 repeated experiments, Table 1 gives the IMSE values of the ruin probability estimates under the three claim distribution assumptions. For each claim distribution, the IMSE decreases as T increases, which is consistent with our conclusion. This shows the stability of the estimation method in this paper, gives a reference for improving the accuracy of data collection and provides a basis for deeper application-level analysis. All numerical experiments in this section were carried out in MATLAB. Taking the exponential density as an example, when T = 120, we completed 300 independent repeated experiments in 68.096740 s.
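The error measures reported in this section can be computed from an array of replicated estimates as in the Python sketch below. The definitions are assumptions of this sketch, since the text does not spell them out: the IMSE is taken as the average over replications of $\int_0^a (\hat{\psi}(u) - \psi(u))^2\,du$, evaluated with the trapezoidal rule, and the average relative error as the pointwise average of $|\hat{\psi}(u)/\psi(u) - 1|$. The function names and the toy data are hypothetical.

```python
import numpy as np

def imse(estimates, truth, u_grid):
    """Average over replications of the integrated squared error on u_grid.
    estimates has shape (n_rep, n_grid), truth has shape (n_grid,)."""
    sq_err = (np.asarray(estimates) - np.asarray(truth)) ** 2
    du = np.diff(u_grid)
    # Trapezoidal rule along the grid, written out explicitly.
    ise = ((sq_err[:, 1:] + sq_err[:, :-1]) * 0.5 * du).sum(axis=1)
    return ise.mean()

def average_relative_error(estimates, truth):
    """Pointwise average over replications of |estimate / truth - 1|."""
    ratio = np.asarray(estimates) / np.asarray(truth)
    return np.abs(ratio - 1.0).mean(axis=0)

# Toy data: three replications that overshoot the reference curve by 0.01.
u_grid = np.linspace(0.0, 30.0, 301)
truth = np.exp(-u_grid)
estimates = np.tile(truth + 0.01, (3, 1))
print(imse(estimates, truth, u_grid))                  # about 0.003
print(average_relative_error(estimates, truth)[:3])
```

Because the relative error divides by the reference value, it grows quickly where the ruin probability is close to zero, even when the absolute error is small.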

Conclusions
This paper shows how to use the CFS method to approximate the ruin probability under the insurance risk model with stochastic premium income, and gives a nonparametric estimator of the corresponding ruin probability. First, we approximate the ruin probability under the model according to the principle of the CFS expansion. Then, using two sample data sets observed on the observation interval, one consisting of the number of claims and the sizes of individual claims and the other consisting of the number of random premium incomes and the premium amounts, we construct a nonparametric estimator of the ruin probability. Finally, we verify the effectiveness of the method through error analysis and numerical experiments. The results suggest that the method can be applied effectively to more complex models, as long as the Fourier transform of the specific claim distribution under the corresponding model can be obtained. The stability observed in the numerical simulations further supports the estimation method in this paper and provides a reference for deeper application research.