The Modelling of Auto Insurance Claim-Frequency Counts by the Inverse Trinomial Distribution

Ong, Seng Huat; Sim, Shin Zhu; Liu, Shuangzhe

doi:10.3390/jrfm18010007

by Seng Huat Ong ^1,2,*, Shin Zhu Sim ^3 and Shuangzhe Liu ^4

^1 Institute of Actuarial Science and Data Analytics, UCSI University, Kuala Lumpur 56000, Malaysia
^2 Institute of Mathematical Sciences, University of Malaya, Kuala Lumpur 50603, Malaysia
^3 School of Mathematical Sciences, University of Nottingham Malaysia, Semenyih 43500, Malaysia
^4 Faculty of Science and Technology, University of Canberra, Canberra 2600, Australia
^* Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2025, 18(1), 7; https://doi.org/10.3390/jrfm18010007

Submission received: 4 November 2024 / Revised: 20 December 2024 / Accepted: 20 December 2024 / Published: 27 December 2024

(This article belongs to the Special Issue Featured Papers in Finance and Society Wellbeing—in Honor of Professors Joe Gani and Chris Heyde)

Abstract

In the transportation services industry, the proper assessment of the insurance claim count distribution is an important step in determining insurance premiums based on policyholders’ risk profiles. Risk factors are identified through regression analysis. In this paper, the inverse trinomial distribution is proposed as a count data model for insurance claims characterised by long tails and a high index of dispersion. Two regression models are developed to identify associated risk factors. Other popular models, such as the negative binomial and COM-Poisson, are fitted and compared using information criteria. The risk profiles of policyholders are determined based on the selected model. To illustrate the application of the inverse trinomial regression models, the ausprivautolong dataset of automobile claims in Australia is fitted and the risk factors are identified.

Keywords:

insurance premiums; negative binomial; COM-Poisson; long tail; over dispersion; transportation services

1. Introduction

A basic objective of insurance companies is to determine the appropriate insurance premium for the insured to cover a particular risk. Based upon the risk characteristics of the insured, a standard procedure to calculate the premium is to multiply the conditional expectation of the claim frequency by the expected cost of the claims (David, 2015). Evaluation of the risks to calculate an insurance premium uses regression by generalized linear models (GLM). Poisson regression is a popular model for claim counts in non-life insurance contexts but suffers from the restrictive equality of mean and variance. In practice, the data often exhibit over-dispersion, where the variance is greater than the mean. To account for over-dispersion, many alternative models have been proposed. One such family of models is the mixed Poisson distribution, with the negative binomial (NB) distribution as a popular choice for modelling claim frequency counts in auto insurance (Denuit et al., 2007). The NB distribution, as a mixed Poisson distribution, is formulated by allowing the Poisson mean to vary as a gamma random variable. In accident-proneness studies, as initially proposed by Greenwood and Yule (1920), the number of accidents is assumed to be Poisson distributed, with individual heterogeneity modelled by letting the Poisson mean follow a gamma distribution. The NB regression model has been particularly effective in analysing motor vehicle crash data, as demonstrated in the work of Lord et al. (2005), Lord (2006) and Li et al. (2008). The distributional characteristics of insurance data, such as a very long tail and high kurtosis, present challenges for traditional NB regression models. To better accommodate these data characteristics and improve model fit, alternatives to NB regression have been developed. For instance, Dean et al. (1989) examined the Poisson-inverse Gaussian regression model. The Poisson-inverse Gaussian distribution has a longer tail than the NB distribution, and Willmot (1987) proposed it as an alternative to the NB distribution.

In this paper, the inverse trinomial (IT) regression model is proposed to analyse claim-frequency count data with very long tails and high over-dispersion. The claim counts in Period 1 and Period 2 of the ausprivautolong dataset of automobile claims in Australia are characterised by long tails and an index of dispersion greater than three. Shimizu and Yanagimoto (1991) introduced the IT distribution as a one-dimensional random walk distribution on the nonnegative line with an absorbing barrier at the origin, where the particle performing the random walk is allowed to remain at a position with positive probability. Aoyama et al. (2008) generalised this random walk model. The NB distribution is a special case of the IT distribution. As a comparison of the over-dispersion, the NB and Poisson-inverse Gaussian distributions have quadratic variance functions of the mean (Morris, 1982), while the IT distribution has a cubic variance function of the mean (Letac & Mora, 1990). This property adds flexibility to the IT distribution when modelling over-dispersion. Phang et al. (2013) examined statistical inference for the IT distribution based on minimum distance estimation with the probability-generating function (Sim & Ong, 2010) and showed that the IT distribution is a viable alternative to the NB and Poisson-inverse Gaussian distributions for the empirical modelling of data. Recently, Ong et al. (2024) derived and discussed some probabilistic properties of the IT distribution. Regression analysis has proven useful in various contexts, for instance, in the analysis of bus travel times for effective journey planning (Cheok et al., 2024). Given these advantages, it is of interest to consider the application of the IT distribution to auto insurance.

This paper is arranged as follows. In Section 2, we introduce the IT distribution and the formulation of the regression models. In Section 3, we describe the auto insurance claim-frequency counts in Australia (the ausprivautolong dataset), present the results of the analysis with the IT distribution and identify the risk factors. Finally, in Section 4, we conclude that the IT model consistently outperforms the traditional NB and the flexible COM-Poisson (Conway & Maxwell, 1962; Sellers et al., 2012) models, as indicated by its lower Akaike information criterion (AIC) and Bayesian information criterion (BIC) scores. This highlights the IT model’s effectiveness for long-tailed and highly over-dispersed datasets.

2. The Inverse Trinomial Distribution and Regression Models

In this section, some basic properties of IT distribution are stated, followed by a presentation of two IT regression models.

2.1. Basic Properties

In the random walk model for the IT distribution, we let the probability of a −1, 0, or +1 step be given by p, q and r, respectively. IT distribution has the probability mass function (pmf) (Shimizu & Yanagimoto, 1991) given by

$$\Pr(X = x) = \frac{λ p^{λ} q^{x}}{x + λ} \sum_{t = 0}^{\lfloor x/2 \rfloor} \binom{x + λ}{t,\ t + λ,\ x - 2t} \left(\frac{p r}{q^{2}}\right)^{t}, \quad x = 0, 1, 2, \dots, \tag{1}$$

where $λ > 0$, $p \geq r$, $p + q + r = 1$ ($p, q, r \geq 0$), $\lfloor x \rfloor$ represents the greatest integer less than or equal to $x$, and

$$\binom{x + λ}{t,\ t + λ,\ x - 2t} = \frac{(x + λ)!}{t!\,(t + λ)!\,(x - 2t)!}.$$

In terms of the Gauss hypergeometric function, the pmf is given by

$$\Pr(X = x) = p^{λ} \left(q + 2\sqrt{p r}\right)^{x} \frac{(λ)_{x}}{x!}\, {}_{2}F_{1}\left[-x,\ λ + \tfrac{1}{2};\ 2λ + 1;\ \frac{4\sqrt{p r}}{q + 2\sqrt{p r}}\right], \tag{2}$$

where $(λ)_{x}$ is the Pochhammer function and ${}_{2}F_{1}$ denotes the Gauss hypergeometric function.
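As a numerical illustration of Equations (1) and (2), the following minimal Python sketch evaluates the IT pmf in log space. It assumes that the factorials of non-integer arguments in Equation (1) are read as gamma functions, $(x + λ)! = Γ(x + λ + 1)$, and that $q > 0$. The function names `it_logpmf` and `it_pmf_hypergeometric` are hypothetical and are not taken from the authors' code; the parameter values in the demonstration are purely illustrative.

```python
import numpy as np
from scipy.special import gammaln, hyp2f1, logsumexp, poch, xlogy


def it_logpmf(x, lam, p, r):
    """Log of Pr(X = x) from Equation (1), for an integer x >= 0.

    Assumes lam > 0, p >= r >= 0 and q = 1 - p - r > 0.
    """
    q = 1.0 - p - r
    t = np.arange(x // 2 + 1)  # summation index t = 0, ..., floor(x / 2)
    # Log multinomial coefficient, with factorials read as gamma functions.
    log_coef = (gammaln(x + lam + 1.0) - gammaln(t + 1.0)
                - gammaln(t + lam + 1.0) - gammaln(x - 2 * t + 1.0))
    # q^x (pr / q^2)^t = (pr)^t q^(x - 2t); xlogy gives 0 when the exponent is 0.
    log_terms = log_coef + xlogy(t, p * r) + xlogy(x - 2 * t, q)
    return np.log(lam) + lam * np.log(p) - np.log(x + lam) + logsumexp(log_terms)


def it_pmf_hypergeometric(x, lam, p, r):
    """Pr(X = x) from Equation (2); used here only as a cross-check."""
    q = 1.0 - p - r
    s = np.sqrt(p * r)
    z = 4.0 * s / (q + 2.0 * s)
    return (p ** lam * (q + 2.0 * s) ** x * poch(lam, x)
            * np.exp(-gammaln(x + 1.0)) * hyp2f1(-x, lam + 0.5, 2.0 * lam + 1.0, z))


# Illustrative parameter values (assumed, not fitted here).
lam, p, r = 0.25, 0.57, 0.04
probs = np.exp([it_logpmf(x, lam, p, r) for x in range(400)])
print("total probability over x = 0..399:", probs.sum())
print("Pr(X = 3) via (1) and (2):", probs[3], it_pmf_hypergeometric(3, lam, p, r))
```

Working with logarithms and `logsumexp` avoids overflow in the gamma functions for large counts, which matters for long-tailed data.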

The probability-generating function (pgf) has the following simple form (Shimizu et al., 1997):

$$G(u) = \left(\frac{2p}{(1 - qu) + \sqrt{(1 - qu)^{2} - 4pr u^{2}}}\right)^{λ},$$

where $λ > 0$, $p + q + r = 1$ and $0 < \frac{4pr}{(1 - q)^{2}} < 1$. When $r = 0$, the NB pgf is obtained. The mean and variance of the IT distribution, with $p > r$, are, respectively,

$$E[X] = μ = \frac{λ\,[1 - (p - r)]}{p - r}, \qquad \mathrm{Var}[X] = σ^{2} = \frac{λ \left\{1 - (p - r) + \frac{2r}{p - r}\right\}}{(p - r)^{2}}.$$

Let $a = p - r$. The index of dispersion (ID) is as follows:

$$\mathrm{ID} = \frac{\mathrm{Var}[X]}{E[X]} = \frac{a(1 - a) + 2r}{(1 - a)\,a^{2}} = ϕ. \tag{3}$$

Note that the ID is a function of $a$ and $r$ and is independent of the parameter $λ$.
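The moment formulas above translate directly into code. The sketch below is a minimal illustration; the function name `it_moments` is hypothetical, and the parameter values in the example are the rounded Period 1 estimates reported in Table 1, used here only for illustration.

```python
def it_moments(lam, p, r):
    """Return (mean, variance, index of dispersion) of the IT distribution."""
    a = p - r
    if not (lam > 0 and r >= 0 and 0 < a < 1 and p + r <= 1):
        raise ValueError("requires lam > 0, r >= 0, 0 < p - r < 1 and p + r <= 1")
    mean = lam * (1.0 - a) / a
    variance = lam * (1.0 - a + 2.0 * r / a) / a ** 2
    return mean, variance, variance / mean


# Rounded Period 1 estimates from Table 1 (illustrative use only).
mean, variance, dispersion = it_moments(lam=0.25, p=0.57, r=0.04)
print(f"mean = {mean:.4f}, variance = {variance:.4f}, ID = {dispersion:.4f}")
```

Setting `r = 0` recovers the NB case, where the ID reduces to $1/p$.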

2.2. The Inverse Trinomial Regression Models

To apply the IT distribution in regression analysis, it is assumed that the response $Y_{i}$ is influenced by a set of explanatory variables or covariates. This is achieved by expressing the parameters as functions of these explanatory variables. To include the covariates, we consider a GLM with log link function. The log link for a parameter, $μ$, is given by

$$\log(μ) = x_{i}^{T} β,$$

where $x_{i}$ is a vector of the explanatory variables and $β$ is the vector of regression parameters, $i = 1, \dots, n$.

The simple form of the IT mean facilitates the development of the regression models. With the equations

$$μ = λ \left(\frac{1 - a}{a}\right) = λ θ, \quad a = p - r, \quad θ = \frac{1 - a}{a},$$

we consider the log link function for the mean as the following:

$$μ = λ θ = \exp(x_{i}^{T} β). \tag{4}$$

Since $μ = λ θ$, the covariates $x_{i}^{T}$ may be linked either through $λ$ or $θ$. This avoids the need to reparametrize the model in terms of the mean, $μ$. The regression models may be formulated as follows:

(a) $λ = θ^{-1} \exp(x_{i}^{T} β)$

The parameter $λ$ is regarded as a function of $θ$ and the covariates $x_{i}^{T}$. This is designated IT Regression I.

(b) $θ = \frac{1 - a}{a} = λ^{-1} \exp(x_{i}^{T} β)$

The parameter $θ$ is regarded as a function of $λ$ and the covariates $x_{i}^{T}$. That is, $θ$, as a function of $a$, is allowed to vary. Since ID is a function of $a$, this may be called the varying dispersion index regression model. This is designated IT Regression II.

For IT Regression I, regression model (a), we substitute $λ = θ^{-1} \exp(x_{i}^{T} β)$ in the IT pmf, while for IT Regression II, model (b), we make the following substitution:

$$p = \frac{1}{1 + λ^{-1} \exp(x_{i}^{T} β)} + r.$$
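The two substitutions can be sketched in a few lines of Python. This is a minimal illustration under the definitions above, not the authors' implementation; the function names and the covariate and coefficient values are hypothetical. Both mappings are constructed so that the implied IT mean equals $\exp(x_{i}^{T} β)$.

```python
import numpy as np


def it_regression_1_params(x, beta, p, r):
    """IT Regression I: lambda depends on the covariates; p and r are fixed."""
    a = p - r
    theta = (1.0 - a) / a
    lam = np.exp(np.dot(x, beta)) / theta
    return lam, p, r


def it_regression_2_params(x, beta, lam, r):
    """IT Regression II: p depends on the covariates; lambda and r are fixed."""
    theta = np.exp(np.dot(x, beta)) / lam
    p = 1.0 / (1.0 + theta) + r
    if p + r > 1.0:
        raise ValueError("implied q = 1 - p - r is negative for these values")
    return lam, p, r


def it_mean(lam, p, r):
    """Mean of the IT distribution, lambda * (1 - a) / a with a = p - r."""
    a = p - r
    return lam * (1.0 - a) / a


# Hypothetical covariate vector (with intercept) and coefficients.
x = np.array([1.0, 0.0, 1.0])
beta = np.array([-1.4, -0.2, 0.15])
print(np.exp(x @ beta))
print(it_mean(*it_regression_1_params(x, beta, p=0.25, r=0.01)))
print(it_mean(*it_regression_2_params(x, beta, lam=0.07, r=0.01)))
```

In model II the implied $q = 1 - p - r$ must remain nonnegative, which is why the sketch checks $p + r \leq 1$ after the substitution.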

2.3. The Random Intercept Regression Model for Clustered Data

The regression model in the previous section may be extended to model clustered data by including a random intercept. Let $y_{ij}$ be the count for individual $j = 1, 2, \dots, m_{i}$ in cluster $i = 1, 2, \dots, m$, where $m$ is the number of clusters and $m_{i}$ is the number of individuals in the $i$-th cluster. This regression model, corresponding to IT Regression Model I, is defined as follows:

$$y_{ij} \mid γ_{i} \sim \mathrm{IT}(λ_{ij}, θ),$$

$$\log(λ_{ij}) = x_{ij}^{T} β + γ_{i},$$

$$γ_{i} \sim f(γ_{i}; δ),$$

where $f(γ_{i}; δ)$ is the probability density function of the random intercept $γ_{i}$, with the vector of parameters $δ$. The conditional pmf of individual $j$ in cluster $i$ is given by

$$P(Y_{ij} = y_{ij} \mid x_{ij}, γ_{i}) = \frac{λ_{ij} p^{λ_{ij}} q^{y_{ij}}}{y_{ij} + λ_{ij}} \sum_{t = 0}^{\lfloor y_{ij}/2 \rfloor} \binom{y_{ij} + λ_{ij}}{t,\ t + λ_{ij},\ y_{ij} - 2t} \left(\frac{p r}{q^{2}}\right)^{t}. \tag{5}$$

2.4. Parameter Estimation

If $Y_{1}, Y_{2}, \dots, Y_{N}$ is a random sample, the log-likelihood function can be written as follows:

$$\ln L(λ, p, r) = \sum_{i = 1}^{N} \ln P(Y_{i} = y_{i}; λ, p, r), \tag{6}$$

where $λ > 0$, $p \geq r$, $p + q + r = 1$ ($p, q, r \geq 0$). The links to the explanatory variables are given by Equation (4), either through $λ$ or $θ$. Due to the complicated log-likelihood function, we consider the direct numerical optimisation of Equation (6). Examples of global optimization algorithms are the simulated annealing (SA) algorithm (Metropolis et al., 1953) and basin-hopping (Wales & Doye, 1997). Simulated annealing is a stochastic search algorithm, based on the physics of annealing in metallurgy, that seeks the global optimum of a complex function. Basin-hopping is a global optimization technique motivated by problems in physical chemistry; it alternates between random perturbation of the coordinates, to jump between basins, and local optimization within each basin, in order to find the best optimum. The challenges in using these algorithms lie in the choice of starting points and appropriate tuning parameters, the time needed to search the whole parameter space and ensuring, with great confidence, that the global optimum has been found.

In the random intercept regression model, the marginal likelihood for the $i$-th cluster is given by

$$L_{i}(β, p, r, δ) = \int \prod_{j = 1}^{m_{i}} P(Y_{ij} = y_{ij} \mid x_{ij}, γ_{i})\, f(γ_{i}; δ)\, dγ_{i}, \tag{7}$$

where $P(Y_{ij} = y_{ij} \mid x_{ij}, γ_{i})$ is given by Equation (5). The distribution of $γ_{i}$ is usually taken to be the normal (Gaussian) distribution $N(μ, σ^{2})$. In this case, we call the model the IT-Gaussian random intercept regression model. The marginal likelihood for the $i$-th cluster is given by Equation (7) with

$$f(γ_{i}; δ) = \frac{1}{σ \sqrt{2π}} e^{-\frac{(γ_{i} - μ)^{2}}{2σ^{2}}}, \tag{8}$$

where $δ = (μ, σ^{2})$. The log-likelihood function is

$$\ln L(β, p, r, δ) = \sum_{i = 1}^{m} \ln L_{i}(β, p, r, δ).$$
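The integral in Equation (7) generally has to be evaluated numerically. One common option is Gauss-Hermite quadrature, sketched below for a single cluster. This is an assumption about how the marginal likelihood could be computed, not the authors' implementation; the function names, the number of quadrature nodes and the example data are hypothetical, and the factorials in Equation (5) are read as gamma functions.

```python
import numpy as np
from scipy.special import gammaln, logsumexp, xlogy


def it_logpmf(y, lam, p, r):
    """Log IT pmf of Equation (5), with factorials read as gamma functions."""
    q = 1.0 - p - r
    t = np.arange(y // 2 + 1)
    log_terms = (gammaln(y + lam + 1.0) - gammaln(t + 1.0)
                 - gammaln(t + lam + 1.0) - gammaln(y - 2 * t + 1.0)
                 + xlogy(t, p * r) + xlogy(y - 2 * t, q))
    return np.log(lam) + lam * np.log(p) - np.log(y + lam) + logsumexp(log_terms)


def cluster_log_marginal(y_i, X_i, beta, p, r, mu, sigma, n_nodes=30):
    """Approximate ln L_i of Equation (7) by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    gammas = mu + np.sqrt(2.0) * sigma * nodes  # change of variable for N(mu, sigma^2)
    eta = X_i @ beta                            # fixed-effect linear predictor
    log_vals = np.empty(n_nodes)
    for k, g in enumerate(gammas):
        lam = np.exp(eta + g)                   # log(lam_ij) = x_ij' beta + gamma_i
        log_vals[k] = sum(it_logpmf(int(y), l, p, r) for y, l in zip(y_i, lam))
    # Integral ~ (1 / sqrt(pi)) * sum_k w_k * prod_j P(y_ij | gamma_k).
    return logsumexp(log_vals + np.log(weights)) - 0.5 * np.log(np.pi)


# Hypothetical cluster with three records and two covariates (intercept, x1).
y_i = np.array([0, 2, 1])
X_i = np.array([[1.0, 0.3], [1.0, -0.5], [1.0, 1.2]])
beta = np.array([-1.4, 0.2])
print(cluster_log_marginal(y_i, X_i, beta, p=0.57, r=0.04, mu=0.0, sigma=0.5))
```

The full log-likelihood is the sum of `cluster_log_marginal` over all clusters, which can then be passed to a numerical optimiser.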

3. Application to the Modelling of Auto Insurance Claim-Frequency Counts

3.1. Auto Insurance Claim-Frequency Count Dataset

We analysed the ausprivautolong dataset, from the ausprivauto datasets of automobile claims in Australia, included in the R package ‘CASdatasets’ (https://cas.uqam.ca/pub/web/CASdatasets-manual.pdf (accessed on 3 November 2024)). The ausprivautolong dataset is a simulated dataset containing counts of claims from 40,000 policies over three periods (years). The simulation is based on a true non-life portfolio. Each policy is regarded as a cluster; hence, there are 3 × 40,000 = 120,000 records. The variables are as follows:

IDpol is the policy identification number.
DrivAge is the age of the policyholder.
VehValue is the vehicle value in thousands of AUD.
Period is the period number.
ClaimNb is the number of claims.
ClaimOcc indicates occurrence of a claim.

The risk factors are DrivAge (driver’s age) and VehValue (vehicle value). In this study, the datasets are analysed separately by period, specifically focusing on Period 1 and Period 2. The claim-frequency count for both Period 1 and 2 has a very long tail and a high index of dispersion; see Table 1 and Table 2.

Both datasets are analysed with the IT regression models. Due to the complexity of the clustered regression model, we have considered the non-clustered case. For comparison, the data are fitted with the NB and COM-Poisson regression models. The NB pmf is given by

$$\Pr(k) = \binom{\frac{1}{φ} + k - 1}{k} \left(\frac{1}{1 + φ θ}\right)^{\frac{1}{φ}} \left(\frac{φ θ}{1 + φ θ}\right)^{k},$$

where $φ, θ > 0$. The COM-Poisson distribution has pmf given by

$$P(X = x) = \frac{λ^{x}}{(x!)^{ν} Z(λ, ν)},$$

where $Z(λ, ν) = \sum_{j = 0}^{\infty} \frac{λ^{j}}{(j!)^{ν}}$ for $λ > 0$ and $ν \geq 0$. The COM-Poisson distribution is under-dispersed when $ν > 1$ and over-dispersed for $ν < 1$. If $ν = 1$, the Poisson distribution is obtained.
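For reference, the two comparison pmfs can be evaluated as in the following minimal sketch. The function names are hypothetical, and the COM-Poisson normalising constant $Z(λ, ν)$ is approximated by truncating the infinite series at `n_terms` terms, which is an assumption of this sketch (for $ν$ close to 0 the series converges only when $λ < 1$).

```python
import numpy as np
from scipy.special import gammaln, logsumexp


def nb_logpmf(k, theta, phi):
    """Log NB pmf in the (theta, phi) parametrisation above; the mean is theta."""
    size = 1.0 / phi
    return (gammaln(size + k) - gammaln(size) - gammaln(k + 1.0)
            - size * np.log1p(phi * theta)
            + k * (np.log(phi * theta) - np.log1p(phi * theta)))


def com_poisson_logpmf(x, lam, nu, n_terms=2000):
    """Log COM-Poisson pmf, with Z(lam, nu) approximated by a truncated series."""
    j = np.arange(n_terms)
    log_z = logsumexp(j * np.log(lam) - nu * gammaln(j + 1.0))
    return x * np.log(lam) - nu * gammaln(x + 1.0) - log_z


# Illustrative evaluation with the rounded Period 1 estimates from Table 1.
counts = np.arange(6)
print(np.exp(nb_logpmf(counts, theta=0.24, phi=5.74)))
print(np.exp(com_poisson_logpmf(counts, lam=0.18, nu=0.01)))
```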

3.2. Fit to the Auto Claim Count Data

To assess goodness-of-fit and for model selection, the IT distribution was compared to the NB and COM-Poisson models by means of chi-square goodness-of-fit statistics, log-likelihood, AIC and BIC. The fitting process was conducted using Python, employing basin-hopping as the global optimisation algorithm and sequential least squares programming (SLSQP) as the local optimiser. Basin-hopping was tuned using the parameters niter = 50, T = 0.5 and stepsize = 0.2. The results from the local SLSQP optimization were used as the starting point for the basin-hopping algorithm. SLSQP was specifically chosen for its suitability in handling constrained optimization problems, where both parameter bounds and custom constraints are enforced. In this case, the constraints for the model are p + q + r = 1 and p > r. The functions basinhopping and minimize, imported from the Python module scipy.optimize, were used for global and local optimization, respectively.

To estimate the standard errors of the parameters, we approximated the Hessian matrix of the log-likelihood function using numerical gradients. This is achieved by calculating the gradient of the log-likelihood function with small changes in the parameter values. We used the numdifftools package, available in Python, to compute the Hessian matrix numerically. The standard errors were then calculated by taking the square roots of the diagonal elements of the inverse of the Hessian matrix. The Python sample codes are given in Appendix A. Due to possible numerical issues in the Hessian matrix calculation, some alternatives, such as bootstrapping and jackknife, could provide better estimates for the asymptotic covariance matrices. See, for instance, Shao and Tu (1995). In clustered data, not accounting for clustering will result in very biased standard error estimates. For a discussion of some methods of clustered regression, see Skinner and de Toledo Vieira (2007).
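The standard-error calculation described above can be sketched on a toy problem. The example below swaps in a hand-written central-difference Hessian for numdifftools so that it depends only on NumPy, and it uses a two-group Poisson log-likelihood (not the IT model) because its standard errors are known in closed form; the function names and the data are hypothetical.

```python
import numpy as np


def numerical_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    hess = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = hess[j, i] = (
                f(x + ei + ej) - f(x + ei - ej) - f(x - ei + ej) + f(x - ei - ej)
            ) / (4.0 * h * h)
    return hess


def standard_errors(neg_log_lik, estimate):
    """Square roots of the diagonal of the inverse Hessian of the negative log-likelihood."""
    covariance = np.linalg.inv(numerical_hessian(neg_log_lik, estimate))
    return np.sqrt(np.diag(covariance))


# Toy data: two groups with Poisson counts and a log link (one dummy per group).
X = np.array([[1, 0]] * 4 + [[0, 1]] * 3, dtype=float)
y = np.array([2, 0, 1, 3, 5, 4, 6], dtype=float)


def neg_log_lik(beta):
    eta = X @ beta
    return np.sum(np.exp(eta) - y * eta)  # Poisson, constant ln(y!) dropped


beta_hat = np.log([y[:4].mean(), y[4:].mean()])  # closed-form MLE for this toy model
print(standard_errors(neg_log_lik, beta_hat))
```

For this toy model the exact standard errors are the reciprocal square roots of the group totals, which gives a simple check on the numerical Hessian before it is applied to a more complicated likelihood.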

Table 1 and Table 2 display the fit without covariates. Based on chi-square statistics, the IT distribution fits better than the NB and COM-Poisson distributions. The distribution of claims, irrespective of claim amounts, may be applied to build bonus–malus systems for auto insurance (Lemaire, 1976, 1985). It was observed that the count data contained many zeroes. The zero-inflated version of the NB, COM-Poisson (Sim et al., 2018) and IT distributions were also fitted, without much improvement, based on chi-square statistics.
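The chi-square statistic used for these comparisons can be illustrated with the Period 1 figures in Table 1. The sketch below assumes the usual Pearson form, the sum of (observed − expected)² / expected over cells, with the observed counts for 12 or more claims pooled to match the grouped NB expected frequency; the function name is hypothetical. The result should be close to the reported value of 652.17, with small differences expected because the tabulated expected frequencies are rounded.

```python
import numpy as np


def pearson_chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over cells."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return float(np.sum((observed - expected) ** 2 / expected))


# Table 1, Period 1: cells for 0, 1, ..., 11 claims, then 12 or more claims.
# The last observed cell pools 7 + 6 + 15 = 28 policies.
observed = [34764, 3704, 875, 301, 129, 87, 39, 22, 17, 17, 9, 8, 28]
expected_nb = [34817.81, 3293.50, 1078.85, 437.18, 194.13, 90.73,
               43.81, 21.64, 10.87, 5.53, 2.84, 1.47, 1.63]

chi2_nb = pearson_chi_square(observed, expected_nb)
print(f"NB chi-square from tabulated values: {chi2_nb:.2f}")
```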

The regression fits are shown in Table 3 and Table 4. Statistically significant covariates at a level of 5 percent are indicated with an asterisk. The IT regression models provide a better fit than the NB and COM-Poisson regressions. Young drivers and vehicles of low value (VehValue 50–75 kAUD) appear to be prone to claims.
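The information criteria in Table 3 follow from the log-likelihoods by the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The sketch below checks this for the NB regression in Period 1, assuming k = 11 free parameters (ten regression coefficients plus φ) and n = 40,000 policies; both counts are a reading of Table 3 and Section 3.1 rather than values stated explicitly by the authors, and the function names are hypothetical.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for k free parameters."""
    return 2.0 * k - 2.0 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k free parameters and n observations."""
    return k * math.log(n) - 2.0 * log_lik


# NB regression, Period 1 (Table 3): reported log-likelihood.
log_lik_nb = -21036.95
print(f"AIC = {aic(log_lik_nb, 11):.2f}")
print(f"BIC = {bic(log_lik_nb, 11, 40000):.2f}")
```

Under these assumptions the results agree with the tabulated AIC and BIC up to rounding of the reported log-likelihood.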

In Table 3, the total computational time reported for IT Regression I is 3170.27 s, which includes both basin-hopping and local SLSQP optimization processes. We performed 50 basin-hopping iterations (niter = 50) and assumed approximately 100 function evaluations per SLSQP iteration. Given the total computation time of 3170 s, the average time per function evaluation is estimated as 3170/50/100 = 0.634 s per evaluation. The mathematical order of computational complexity is given by O(Complexity of SLSQP × Temperature steps × Total number of iterations).

4. Conclusions

The proper assessment of the insurance claim count distribution is an important step in determining insurance premiums based on policyholders’ risk profiles. Most of the analyses of insurance claim count distributions in the literature involve short-tailed count frequency distributions. In this paper, the inverse trinomial distribution was used to model insurance claim counts characterised by long tails and high over-dispersion. Two regression models were developed for the identification of associated risk factors and were shown to outperform the NB and COM-Poisson regression models in terms of AIC and BIC. The results indicate that the COM-Poisson model fails to fit the data, both with and without covariates, as evidenced by its having the highest chi-square, AIC and BIC values. This highlights the reliability and flexibility of the IT model in handling count data with long tails and significant over-dispersion. The IT model would be useful in the construction of an improved bonus–malus system for auto insurance. In the analysis of the ausprivautolong dataset of automobile claims in Australia, driver age was identified as a significant factor, with claim frequency decreasing as driver age increases. Additionally, lower-value vehicles were shown to be more prone to claims.

The IT model’s adaptability suggests it could be applied to other fields with highly over-dispersed count data such as healthcare, social sciences and public safety. For example, it could help analyse medical claim frequencies or accident counts, where the data often have long tails and high variability, improving risk assessment and policy planning.

Author Contributions

Conceptualization, S.H.O.; Methodology, S.H.O., S.Z.S. and S.L.; Validation, S.H.O. and S.L.; Formal analysis, S.H.O., S.Z.S. and S.L.; Investigation, S.H.O., S.Z.S. and S.L.; Resources, S.Z.S.; Data curation, S.H.O. and S.Z.S.; Writing—original draft, S.H.O., S.Z.S. and S.L.; Writing—review & editing, S.H.O., S.Z.S. and S.L.; Supervision, S.H.O.; Project administration, S.H.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data utilized in this study are publicly available from the R package CASdatasets.

Acknowledgments

The authors wish to thank the reviewers for their insightful comments, which have greatly improved this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The following is a sample of the code used for log-likelihood, optimization and Hessian approximation:

# Import the necessary libraries.
import numpy as np
import numdifftools as nd
from scipy.optimize import minimize, basinhopping

# Log-likelihood function.
def log_likelihood(params, X, y, N):
    """
    Calculate the negative total log-likelihood of the IT Regression I model.

    Parameters:
    - params: parameters [beta, p, r]
    - X: design matrix (numpy array of shape (N, P-2))
    - y: observed data (numpy array of length N)
    - N: size of the dataset (number of observations)

    Returns:
    - float: The negative log-likelihood value to be minimized.
    """
    P = X.shape[1] + 2  # Number of parameters, including p and r.
    beta = params[:P-2]  # Beta parameters.
    p = params[P-2]  # p parameter.
    r = params[P-1]  # r parameter.
    q = 1.0 - p - r  # q parameter (based on p and r).

    large_negative = -1e10  # Penalty value for invalid cases.

    LL = np.zeros(N)
    for k in range(N):
        linear_predictor = np.dot(X[k, :], beta)
        Lam = (p - r) * np.exp(linear_predictor) / (1.0 - (p - r))

        # Check if Lam, p and q are positive.
        if Lam <= 0 or p <= 0 or q <= 0:
            LL[k] = large_negative
            continue

        if y[k] == 0:
            LL[k] = Lam * np.log(p)

        elif y[k] == 1:
            LL[k] = Lam * np.log(p) + np.log(Lam) + np.log(q)

        else:
            # Helper for y > 1; its definition is not included in this sample.
            LL[k] = custom_log_likelihood_for_y_greater_than_1(y[k], Lam, p, r, q)

    return -np.sum(LL)

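# Inequality constraints for SLSQP (each function must return a value >= 0).
# P is the module-level parameter count defined below, before the optimizer is called.
def constraint_sum(params):
    p = params[P-2]
    r = params[P-1]
    return 1.0 - (p + r)

def constraint_diff(params):
    p = params[P-2]
    r = params[P-1]
    return p - r

constraints = [
    {'type': 'ineq', 'fun': constraint_sum},
    {'type': 'ineq', 'fun': constraint_diff},
]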

# Initial guess for SLSQP optimization.
P = 12  # Total number of parameters (e.g., 12 for 10 beta parameters, p and r)
initial_guess = np.concatenate((np.full(P-2, 0.5), [0.5, 0.1]))  # Initial guess for beta, p and r.

# Example bounds for beta, p, and r.
bounds = [(-5, 5) for _ in range(P-2)] + [(0.01, 0.999), (0.01, 0.499)]  # Adjust bounds if necessary.

# Run SLSQP locally to obtain initial parameter estimates.
# X, y and N (design matrix, claim counts and sample size) must be defined beforehand.
slsqp_result = minimize(
    log_likelihood,
    initial_guess,
    args=(X, y, N),
    method='SLSQP',
    bounds=bounds,
    constraints=constraints,
    options={'disp': True}
)

# Extract the estimated parameters from SLSQP.
initial_basin_hopping = slsqp_result.x

# Define local minimizer for basin-hopping (SLSQP).
minimizer_kwargs = {
    'method': 'SLSQP',
    'args': (X, y, N),
    'bounds': bounds,
    'constraints': constraints
}

# Perform basin-hopping using the initial SLSQP result.
result = basinhopping(
    log_likelihood,
    x0=initial_basin_hopping,
    minimizer_kwargs=minimizer_kwargs,
    niter=50,
    T=0.5,
    stepsize=0.2,
    disp=True
)

def hessian_approximation(params):
    """
    Approximate the Hessian matrix using numdifftools.

    Parameters:
    - params: parameters [beta, p, r]

    Returns:
    - hessian: Hessian matrix.
    """
    try:
        # Use numdifftools to compute the Hessian matrix.
        hessian_func = nd.Hessian(lambda p: log_likelihood(p, X, y, N))
        hessian = hessian_func(params)

        # Regularization to ensure stability (optional, based on context).
        regularization = 1e-6
        hessian += regularization * np.eye(len(params))

        # Validate Hessian.
        if np.any(np.isnan(hessian)) or np.any(np.isinf(hessian)):
            raise ValueError("Hessian contains NaN or infinity values.")
        if np.any(np.diag(hessian) <= 0):
            print("Warning: Hessian has non-positive diagonal elements.")

    except Exception as e:
        print(f"Error in Hessian computation: {e}")
        return np.full((len(params), len(params)), np.nan)

    return hessian

References

Aoyama, K., Shimizu, K., & Ong, S. H. (2008). A first-passage time random walk distribution with five transition probabilities: A generalization of the shifted inverse trinomial. Annals of the Institute of Statistical Mathematics, 60, 1–20.
Cheok, C. C. T., Khoo, W. C., & Khoo, H. L. (2024). Bus travel time variability modelling using Burr type XII regression: A case study of Klang Valley. KSCE Journal of Civil Engineering, 28, 3998–4009.
Conway, R. W., & Maxwell, W. L. (1962). A queueing model with state dependent service rates. Journal of Industrial Engineering, 12, 132–136.
David, M. (2015). Auto insurance premium calculation using generalized linear models. Procedia Economics and Finance, 20, 147–156.
Dean, C., Lawless, J. F., & Willmot, G. E. (1989). A mixed Poisson–Inverse-Gaussian regression model. Canadian Journal of Statistics, 17, 171–181.
Denuit, M., Maréchal, X., Pitrebois, S., & Walhin, J. F. (2007). Modeling of claim counts. Risk classification, credibility and bonus-malus systems. Wiley.
Greenwood, M., & Yule, G. U. (1920). An inquiry into the nature of frequency distributions of multiple happenings, with particular reference to the occurrence of multiple attacks of disease or repeated accidents. Journal of the Royal Statistical Society A, 83, 255–279.
Lemaire, J. (1976). Driver versus company. Scandinavian Actuarial Journal, 1976, 209–219.
Lemaire, J. (1985). Automobile insurance: Actuarial models. Kluwer-Nijhoff.
Letac, G., & Mora, M. (1990). Natural real exponential families with cubic variance functions. Annals of Statistics, 18, 1–37.
Li, X., Lord, D., Zhang, Y., & Xie, Y. (2008). Predicting motor vehicle crashes using Support Vector Machine models. Accident Analysis & Prevention, 40, 1611–1618.
Lord, D. (2006). Modeling motor vehicle crashes using Poisson-gamma models: Examining the effects of low sample mean values and small sample size on the estimation of the fixed dispersion parameter. Accident Analysis & Prevention, 38(4), 751–766.
Lord, D., Washington, S. P., & Ivan, J. N. (2005). Poisson, Poisson-gamma and zero inflated regression models of motor vehicle crashes: Balancing statistical fit and theory. Accident Analysis & Prevention, 37(1), 35–46.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., & Teller, E. (1953). Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21, 1087–1092.
Morris, C. (1982). Natural exponential families with quadratic variance functions. Annals of Statistics, 10(1), 65–80.
Ong, S. H., Liew, K. W., Shimizu, K., & Loh, Y. F. (2024). Mixed Poisson formulation of inverse trinomial distribution and other properties. [Submitted].
Phang, Y. N., Sim, S. Z., & Ong, S. H. (2013). Statistical analysis for the inverse trinomial distribution. Communications in Statistics—Simulation and Computation, 42, 2073–2085.
Sellers, K. F., Borle, S., & Shmueli, G. (2012). The COM-Poisson model for count data: A survey of methods and applications. Applied Stochastic Models in Business and Industry, 28, 104–116.
Shao, J., & Tu, D. (1995). The jackknife and bootstrap. Springer.
Shimizu, K., & Yanagimoto, T. (1991). The inverse trinomial distribution. Japanese Journal of Applied Statistics, 20(2), 89–96. (In Japanese).
Shimizu, K., Nishii, N., & Minami, M. (1997). Multivariate inverse trinomial distribution as a Lagrangian probability model. Communications in Statistics—Theory and Methods, 26(7), 1585–1598.
Sim, S. Z., & Ong, S. H. (2010). Parameter estimation for discrete distributions by generalized Hellinger-type divergence based on probability generating function. Communications in Statistics—Simulation and Computation, 39, 305–314.
Sim, S. Z., Gupta, R. C., & Ong, S. H. (2018). Zero-inflated Conway-Maxwell Poisson distribution to analyze discrete data. International Journal of Biostatistics, 14(1), 20160070.
Skinner, C. J., & de Toledo Vieira, M. (2007). Variance estimation in the analysis of clustered longitudinal survey data. Survey Methodology, 33(1), 3–12.
Wales, D. J., & Doye, J. P. (1997). Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. The Journal of Physical Chemistry A, 101(28), 5111–5116.
Willmot, G. E. (1987). The Poisson-inverse Gaussian distribution as an alternative to the negative binomial. Scandinavian Actuarial Journal, 1987, 113–127.

Table 1. Fitting of automobile claim dataset in Australia, Period 1 (without covariates).

		Expected Frequency
Number of claims	Observed Frequency	NB	COM-Poisson	IT
0	34,764	34,817.81	32,847.04	34,814.06
1	3704	3293.50	5952.58	3391.38
2	875	1078.85	1006.49	1008.93
3	301	437.18	163.42	401.10
4	129	194.13	25.78	183.67
5	87	90.73	4.69 ^	91.36
6	39	43.81		47.97
7	22	21.64		26.16
8	17	10.87		14.67
9	17	5.53		8.41
10	9	2.84		4.91
11	8	1.47		2.90
12	7	1.63 ^		1.74
13	6			1.05
14+	15			1.69 ^
$χ^{2}$		652.17	12,055.59	256.35
$\hat{θ}$		0.24
$\hat{φ}$		5.74
$\hat{υ}$			0.01
$\hat{λ}$			0.18	0.25
$\hat{p}$				0.57
$\hat{r}$				0.04
Log-likelihood		−21,073.78	−22,571.02	−20,990.19
AIC		42,151.56	45,146.05	41,986.34
BIC		42,152.69	45,145.63	41,988.50

^ Indicates the expected frequency is grouped. The index of dispersion = 3.11.

Table 2. Fitting of automobile claim dataset in Australia, Period 2 (without covariates).

		Expected Frequency
Number of claims	Observed Frequency	NB	COM-Poisson	IT
0	34,346	34,408.04	32,263.66	34,401.50
1	3901	3468.01	6249.50	3583.95
2	1005	1178.66	1202.17	1102.91
3	343	494.52	230.32	451.40
4	155	227.18	44.00	212.54
5	75	109.80	8.39	108.66
6	56	54.82	1.97 ^	58.62
7	33	27.99		32.85
8	19	14.53		18.94
9	11	7.64		11.15
10	15	4.06		6.68
11	6	2.17		4.06
12	7	1.17		2.50
13	3	1.40 ^		1.55
14+	25			2.68 ^
$χ^{2}$		733.41	17,113.73	295.70
$\hat{θ}$		0.22
$\hat{φ}$		5.93
$\hat{υ}$			0.01
$\hat{λ}$			0.19	0.25
$\hat{p}$				0.55
$\hat{r}$				0.04
Log-likelihood		−22,530.84	−24,361.47	−22,434.82
AIC		45,065.68	48,726.95	44,875.64
BIC		45,066.96	48,726.84	44,877.77

^ Indicates the expected frequency is grouped. The index of dispersion = 3.40.

Table 3. Fitting of automobile claim dataset in Australia, Period 1 (with covariates).

	Estimate and Its Standard Error
Parameters/Coefficients	NB Regression	COM-Poisson Regression	IT Regression I	IT Regression II
Intercept	−1.36	−1.58	−1.42	−1.33
	(0.05) *	(0.03) *	(0.06) *	(0.08) *
DrivAge young people	−0.11	−0.08	−0.02	−0.11
	(0.06)	(0.04)	(0.06)	(0.09)
DrivAge working people	−0.20	−0.16	−0.09	−0.19
	(0.06) *	(0.04) *	(0.06)	(0.09) *
DrivAge older work people	−0.22	−0.18	−0.11	−0.22
	(0.02) *	(0.04) *	(0.06) *	(0.09) *
DrivAge old people	−0.39	−0.32	−0.20	−0.40
	(0.07) *	(0.04) *	(0.07) *	(0.10) *
DrivAge oldest people	−0.40	−0.33	−0.22	−0.41
	(0.08) *	(0.05) *	(0.08)	(0.11) *
VehValue 50–75 kAUD	0.15	0.12	0.08	0.15
	(0.04) *	(0.03) *	(0.04)	(0.06) *
VehValue 75–100 kAUD	0.18	0.14	0.06	0.18
	(0.12)	(0.07)	(0.12)	(0.17)
VehValue 100–125 kAUD	−0.90	−0.79	−0.62	−0.90
	(0.55)	(0.43)	(0.58)	(0.67)
VehValue >125 kAUD	−0.02	−0.003	−0.77	−0.04
	(0.67)	(0.41)	(1.00)	(0.93)
$\hat{φ}$	5.85	-	-	-
	(0.15) *
$\hat{υ}$	-	0.01	-	-
		(0.03)
$\hat{λ}$	-	-	-	0.07
				(0.002) *
$\hat{p}$	-	-	0.25	-
			(0.01) *
$\hat{r}$	-	-	0.01	0.01
			(0.003) *	(0.005) *
Log-likelihood	−21,036.95	−22,660.24	−16,050.81	−16,043.59
AIC	42,095.89	45,342.49	32,125.62	32,111.18
BIC	42,190.45	45,437.05	32,228.78	32,214.34

The bracketed figures indicate the standard errors of the parameter estimates, while an asterisk (*) denotes statistically significant estimates at a 5% level.
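To illustrate how an estimate and its standard error relate to the 5% significance flag, the sketch below applies a two-sided Wald z-test to a few entries from the NB Regression column of Table 3. The choice of a Wald test is an assumption, since the test actually used is not stated here, and the function name `wald_test` is hypothetical. Because the tabulated values are rounded to two decimals, such a check may not reproduce every asterisk in the table.

```python
from math import erf, sqrt


def wald_test(estimate, std_error, z_crit=1.959964):
    """Return (z, two-sided p-value, significant at 5%) for a Wald z-test."""
    z = estimate / std_error
    # Standard normal CDF evaluated at |z|, written via the error function.
    cdf = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    p_value = 2.0 * (1.0 - cdf)
    return z, p_value, abs(z) > z_crit


# (estimate, standard error) from the NB Regression column of Table 3.
nb_period1 = {
    "DrivAge young people": (-0.11, 0.06),
    "DrivAge working people": (-0.20, 0.06),
    "VehValue 50-75 kAUD": (0.15, 0.04),
    "VehValue 75-100 kAUD": (0.18, 0.12),
}

for name, (est, se) in nb_period1.items():
    z, p, significant = wald_test(est, se)
    flag = "*" if significant else ""
    print(f"{name}: z = {z:.2f}, p = {p:.4f} {flag}")
```

For these four rows the flags agree with the table: the working-age and 50–75 kAUD coefficients are significant, while the young-driver and 75–100 kAUD coefficients are not.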

Table 4. Fitting of automobile claim dataset in Australia, Period 2 (with covariates).

Parameters/Coefficients	Estimate (Standard Error)
	NB Regression	COM-Poisson Regression	IT Regression I	IT Regression II
Intercept	−1.22	−1.48	−1.23	−1.18
	(0.05) *	(0.03) *	(0.05) *	(0.08) *
DrivAge young people	−0.19	−0.15	−0.15	−0.21
	(0.06) *	(0.03) *	(0.06) *	(0.09) *
DrivAge working people	−0.23	−0.18	−0.17	−0.23
	(0.06) *	(0.03) *	(0.06) *	(0.09) *
DrivAge older work people	−0.26	−0.21	−0.19	−0.27
	(0.06) *	(0.03) *	(0.06) *	(0.09) *
DrivAge old people	−0.44	−0.35	−0.35	−0.45
	(0.07) *	(0.04) *	(0.07) *	(0.09) *
DrivAge oldest people	−0.32	−0.25	−0.25	−0.33
	(0.07) *	(0.04) *	(0.07) *	(0.10) *
VehValue 50–75 kAUD	0.19	0.15	0.11	0.20
	(0.04) *	(0.02) *	(0.04) *	(0.06) *
VehValue 75–100 kAUD	0.14	0.11	0.03	0.14
	(0.12)	(0.07)	(0.12)	(0.17)
VehValue 100–125 kAUD	−0.59	−0.45	−0.36	−0.61
	(0.47)	(0.33)	(0.50)	(0.61)
VehValue >125 kAUD	−0.62	−0.51	−0.83	−0.62
	(0.76)	(0.55)	(1.00)	(0.98)
$\hat{φ}$	5.67	-	-	-
	(0.14) *
$\hat{υ}$	-	0.01	-	-
		(0.03)
$\hat{λ}$	-	-	-	0.07
				(0.002) *
$\hat{p}$	-	-	0.23	-
			(0.01) *
$\hat{r}$	-	-	0.01	0.01
			(0.003) *	(0.005) *
Log-likelihood	−22,492.96	−24,289.21	−16,803.29	−16,803.16
AIC	45,007.92	48,600.42	33,630.58	33,630.31
BIC	45,104.62	48,697.13	33,736.08	33,735.81

The bracketed figures indicate the standard errors of the parameter estimates, while an asterisk (*) denotes statistically significant estimates at a 5% level.
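One way to read the coefficients as risk relativities is to exponentiate them. The sketch below does so for two entries from the NB Regression column of Table 4. It assumes a log link for the mean claim frequency, which is the usual choice for count regression but is an assumption here, and it interprets each ratio relative to the omitted reference level of the factor. The function name `rate_ratio` is hypothetical.

```python
from math import exp


def rate_ratio(coefficient):
    """Multiplicative effect on the mean under an assumed log link."""
    return exp(coefficient)


# Coefficients from the NB Regression column of Table 4 (Period 2).
rr_old = rate_ratio(-0.44)       # DrivAge old people
rr_veh_50_75 = rate_ratio(0.19)  # VehValue 50-75 kAUD

print(f"DrivAge old people: {rr_old:.3f}")
print(f"VehValue 50-75 kAUD: {rr_veh_50_75:.3f}")
```

Under this assumption, a ratio below 1 indicates a lower expected claim frequency than the reference level, and a ratio above 1 indicates a higher one.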

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
