## Abstract

## 1. Introduction

## 2. Methodology

#### 2.1. Mixture-Based Clustering for the Ordered Stereotype Model

**Likelihood functions:** The (incomplete) likelihood of the data is

$$L(\Omega,\{\tau_g\}\mid\{y_{ij}\})=\prod_{i=1}^{n}\left[\sum_{g=1}^{G}\tau_g\prod_{j=1}^{m}\prod_{k=1}^{q}\left(\theta_{gjk}\right)^{I(y_{ij}=k)}\right].$$

We define the unknown row group memberships through the following indicator latent variables,

$$Z_{ig}=I(i\in g)=\begin{cases}1 & \text{if } i\in g\\ 0 & \text{if } i\notin g\end{cases}\qquad i=1,\dots,n,\quad g=1,\dots,G,$$

which follow the multinomial distribution

$$(Z_{i1},\dots,Z_{iG})\sim\mathrm{Multinomial}(1;\tau_1,\dots,\tau_G),\qquad i=1,\dots,n.$$

The complete data log-likelihood is then

$$\ell_c(\Omega,\{\tau_g\}\mid\{y_{ij}\},\{z_{ig}\})=\sum_{i=1}^{n}\sum_{g=1}^{G}z_{ig}\log\left(\tau_g\right)+\sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{q}\sum_{g=1}^{G}z_{ig}\,I(y_{ij}=k)\log\left(\theta_{gjk}\right).$$
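As an illustration of the incomplete likelihood above, the following minimal Python sketch evaluates its logarithm for given $\{\tau_g\}$ and $\{\theta_{gjk}\}$. The function name `log_likelihood` and the nested-list layout of `theta` are assumptions made for this example only; they are not taken from the original implementation.

```python
import math

def log_likelihood(y, tau, theta):
    """Incomplete-data log-likelihood of the row-clustering mixture.

    y     : n x m nested list of ordinal responses coded 1..q
    tau   : list of G mixing proportions (assumed to sum to 1)
    theta : theta[g][j][k-1] = Pr(y_ij = k | row i belongs to cluster g)
    (This data layout is an assumption of the sketch.)
    """
    total = 0.0
    for row in y:
        # log( tau_g * prod_j theta_{g, j, y_ij} ) for each cluster g
        log_terms = [
            math.log(tau[g])
            + sum(math.log(theta[g][j][k - 1]) for j, k in enumerate(row))
            for g in range(len(tau))
        ]
        # log-sum-exp over clusters, for numerical stability
        peak = max(log_terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return total
```

Working on the log scale avoids underflow when the number of columns $m$ is large, since the product over $j$ and $k$ can become very small.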
**Parameter estimation:** Parameter estimation for a fixed number of components $G$ is performed by maximum likelihood using the expectation-maximization (EM) algorithm proposed by Dempster et al. (1977), which is used in most of the finite mixture problems discussed by McLachlan and Peel (2004).

The EM algorithm consists of two steps: expectation (E-step) and maximization (M-step). In the E-step, the conditional expectation of the complete data log-likelihood function is obtained given the observed data and the current parameter estimates. In the finite mixture model, the latent data correspond to the component identifiers, and the expectation is taken with respect to their conditional posterior distribution given the observed data and the current parameter estimates. The resulting quantity, computed at each iteration of the EM algorithm, is the posterior probability that response ${y}_{ij}$ comes from the $g$th mixture component. In the M-step, the component-specific parameter estimates $\Omega$ are found by numerically solving the maximum likelihood estimation problem for each of the component distributions.

The E-step and M-step alternate until the relative increase in the log-likelihood function is no larger than a small pre-specified tolerance, at which point the EM algorithm is considered to have converged. To find the optimal number of components, the maximum likelihood estimates are obtained for each candidate number of groups $G$, and the final model is chosen using a model selection criterion. In this model, the EM algorithm performs a fuzzy assignment of rows to clusters based on the posterior probabilities.
The EM algorithm is initialized with an estimate $\{{\widehat{\Omega}}^{\left(0\right)},\{{\widehat{\tau}}_{g}^{\left(0\right)}\}\}$ of the parameters and proceeds by alternation of the E-step and M-step to estimate the missing data $\{{\widehat{Z}}_{ig}\}$ and to update the parameter estimates. In this section, we develop the E-step and M-step for row clustering. This development follows closely Fernández et al. (2016) (Section 3).
**E-step:** In the $t$th iteration of the EM algorithm, the E-step evaluates the expected values ${\widehat{Z}}_{ig}$ of the unknown classifications ${Z}_{ig}$ conditional on the data $\{{y}_{ij}\}$ and the previous estimates of the parameters $\{{\widehat{\Omega}}^{(t-1)},\{{\widehat{\tau}}_{g}^{(t-1)}\}\}$. The conditional expectation of the complete data log-likelihood at iteration $t$ is given by

$$\begin{aligned}Q(\Omega,\{\tau_g\}\mid\widehat{\Omega}^{(t-1)},\{\widehat{\tau}_g^{(t-1)}\})&=E_{\{Z_{ig}\}\mid\{y_{ij}\},\widehat{\Omega}^{(t-1)}}\left[\ell_c(\Omega,\{\tau_g\}\mid\{y_{ij}\},\{Z_{ig}\})\right]\\&=\sum_{i=1}^{n}\sum_{g=1}^{G}\log\left(\tau_g\right)E\left[Z_{ig}\mid\{y_{ij}\},\widehat{\Omega}^{(t-1)}\right]\\&\quad+\sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{q}\sum_{g=1}^{G}I(y_{ij}=k)\log\left(\theta_{gjk}\right)E\left[Z_{ig}\mid\{y_{ij}\},\widehat{\Omega}^{(t-1)}\right].\end{aligned}$$

By Bayes' rule, the expected value of $Z_{ig}$ is the posterior probability

$$\begin{aligned}\widehat{Z}_{ig}^{(t)}=\Pr\left[Z_{ig}=1\mid\{y_{ij}\},\widehat{\Omega}^{(t-1)}\right]&=\frac{\Pr\left(\{y_{ij}\}\mid Z_{ig}=1,\widehat{\Omega}^{(t-1)}\right)\Pr\left(Z_{ig}=1\right)}{\sum_{\ell=1}^{G}\Pr\left(\{y_{ij}\}\mid Z_{i\ell}=1,\widehat{\Omega}^{(t-1)}\right)\Pr\left(Z_{i\ell}=1\right)}\\&=\frac{\widehat{\tau}_g^{(t-1)}\prod_{j=1}^{m}\prod_{k=1}^{q}\left(\widehat{\theta}_{gjk}^{(t-1)}\right)^{I(y_{ij}=k)}}{\sum_{\ell=1}^{G}\left\{\widehat{\tau}_{\ell}^{(t-1)}\prod_{j=1}^{m}\prod_{k=1}^{q}\left(\widehat{\theta}_{\ell jk}^{(t-1)}\right)^{I(y_{ij}=k)}\right\}}.\end{aligned}$$

Substituting these expected values completes the E-step:

$$\widehat{Q}(\Omega,\{\tau_g\}\mid\widehat{\Omega}^{(t-1)},\{\widehat{\tau}_g^{(t-1)}\})=\sum_{i=1}^{n}\sum_{g=1}^{G}\widehat{Z}_{ig}^{(t)}\log\left(\tau_g\right)+\sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{q}\sum_{g=1}^{G}\widehat{Z}_{ig}^{(t)}I(y_{ij}=k)\log\left(\theta_{gjk}\right).$$
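The posterior probabilities ${\widehat{Z}}_{ig}^{(t)}$ of the E-step can be sketched in a few lines of Python. This is only an illustrative sketch under assumed names (`e_step`, `tau`, `theta`) and an assumed nested-list layout; the toy parameter values are made up and do not come from the data analyzed in this paper.

```python
import math

def e_step(y, tau, theta):
    """Return z_hat[i][g] = Pr(Z_ig = 1 | y_i, current estimates).

    y     : n x m nested list of responses coded 1..q
    tau   : current mixing proportions (length G)
    theta : theta[g][j][k-1] = current estimate of Pr(y_ij = k | cluster g)
    """
    z_hat = []
    for row in y:
        # unnormalised log posterior weight of each cluster for this row
        log_w = [
            math.log(tau[g])
            + sum(math.log(theta[g][j][k - 1]) for j, k in enumerate(row))
            for g in range(len(tau))
        ]
        peak = max(log_w)  # subtract the maximum before exponentiating
        w = [math.exp(v - peak) for v in log_w]
        norm = sum(w)
        z_hat.append([v / norm for v in w])
    return z_hat

# Toy example (hypothetical values): two clusters, two columns, q = 2.
tau = [0.5, 0.5]
theta = [
    [[0.8, 0.2], [0.8, 0.2]],  # cluster 1 favours category 1
    [[0.2, 0.8], [0.2, 0.8]],  # cluster 2 favours category 2
]
z_hat = e_step([[1, 1], [2, 2], [1, 2]], tau, theta)
```

In this toy example the row `[1, 1]` is assigned to the first cluster with probability $0.32/0.34 = 16/17$, while the row `[1, 2]` is split evenly between the two clusters, which is the fuzzy assignment described in the text.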
**M-step:** The M-step of the EM algorithm is the global maximization of the expected log-likelihood (4) obtained in the E-step, now conditional on the complete data $\{\{{y}_{ij}\},\{{\widehat{Z}}_{ig}\}\}$. For finite mixture models, the term containing the row-cluster proportions $\{{\tau}_{1},\dots,{\tau}_{G}\}$ and the term containing the remaining parameters $\Omega$ can be maximized independently. Thus, the M-step has two separate parts. The maximum-likelihood estimator for the parameter ${\tau}_{g}$, where the data ${Z}_{ig}$ are unobserved, is

$$\widehat{\tau}_g^{(t)}=\frac{1}{n}\sum_{i=1}^{n}E\left[Z_{ig}\mid\{y_{ij}\},\widehat{\Omega}^{(t-1)}\right]=\frac{1}{n}\sum_{i=1}^{n}\widehat{Z}_{ig}^{(t)},\qquad g=1,\dots,G,$$

and the estimate of $\Omega$ is obtained by numerically maximizing the remaining term,

$$\widehat{\Omega}^{(t)}=\underset{\Omega}{\mathrm{argmax}}\left[\sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{q}\sum_{g=1}^{G}\widehat{Z}_{ig}^{(t)}I(y_{ij}=k)\log\left(\theta_{gjk}(\Omega)\right)\right].$$
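The two parts of the M-step can be illustrated as follows. The sketch gives the closed-form update for the mixing proportions and the weighted objective that a numerical optimizer would maximize over $\Omega$; the optimizer itself and the mapping from $\Omega$ to $\theta_{gjk}(\Omega)$ are not shown. The names `update_tau` and `omega_objective` are hypothetical and introduced only for this example.

```python
import math

def update_tau(z_hat):
    """Closed-form update: tau_g = (1/n) * sum_i z_hat[i][g]."""
    n = len(z_hat)
    n_clusters = len(z_hat[0])
    return [sum(row[g] for row in z_hat) / n for g in range(n_clusters)]

def omega_objective(y, z_hat, theta):
    """Weighted log-likelihood term to be maximized over Omega.

    theta[g][j][k-1] is theta_gjk(Omega) evaluated at a candidate Omega;
    how theta depends on Omega is model-specific and assumed to be
    supplied by the caller.
    """
    return sum(
        z_hat[i][g] * math.log(theta[g][j][k - 1])
        for i, row in enumerate(y)
        for g in range(len(z_hat[i]))
        for j, k in enumerate(row)
    )
```

In practice `omega_objective` would be passed (with its sign flipped) to a general-purpose numerical minimizer, since the stereotype parameterization does not in general give a closed-form solution for $\Omega$.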

#### 2.2. The General Linear Cluster-Weighted Model

**Modeling for $f(y\mid\boldsymbol{x},\boldsymbol{\vartheta}_g)$ and $f(\boldsymbol{x},\boldsymbol{\theta}_g)$:** The CWM is based on the assumption that $f(y\mid\boldsymbol{x},\boldsymbol{\vartheta}_g)$ belongs to the exponential family of distributions, which is strictly related to GLMs. The link function in Equation (5) relates the expected value $\mu_g$ of the response to the covariates through $g(\mu_g)=\beta_{0g}+\beta_{1g}x_1+\dots+\beta_{pg}x_p$. We are interested in the estimation of the vector $\boldsymbol{\beta}_g$, so the distribution of $y\mid\boldsymbol{x},\boldsymbol{\vartheta}_g$ is denoted by $f(y\mid\boldsymbol{x},\boldsymbol{\beta}_g,\lambda_g)$, where $\lambda_g$ denotes an additional parameter associated with a two-parameter exponential family.

The marginal distribution $f(\boldsymbol{x},\boldsymbol{\theta}_g)$ has two components: $f(\boldsymbol{v},\boldsymbol{\theta}_g^{\prime})$ and $f(\boldsymbol{w},\boldsymbol{\theta}_g^{\prime\prime})$. The first component is modeled as a $p$-variate Gaussian density with mean $\boldsymbol{\mu}_g$ and covariance matrix $\boldsymbol{\Sigma}_g$, denoted $\varphi(\boldsymbol{v},\boldsymbol{\mu}_g,\boldsymbol{\Sigma}_g)$.

The marginal density $f(\boldsymbol{w},\boldsymbol{\theta}_g^{\prime\prime})$ assumes that each finite discrete covariate $w_r$ is represented as a vector $\boldsymbol{w}^{r}=(w^{r1},\dots,w^{rc_r})^{\prime}$, where $w^{rs}=1$ if $w_r$ has the value $s$, with $s\in\{1,\dots,c_r\}$, and $w^{rs}=0$ otherwise. The density of the discrete covariates is then

$$f(\boldsymbol{w},\boldsymbol{\gamma}_g)=\prod_{r=1}^{q}\prod_{s=1}^{c_r}\left(\gamma_{grs}\right)^{w^{rs}},\qquad g=1,\dots,G,$$

and the joint density of the model is

$$f(\boldsymbol{x},y;\boldsymbol{\Phi})=\sum_{g=1}^{G}\tau_g\,f(y\mid\boldsymbol{x};\boldsymbol{\beta}_g,\lambda_g)\,\varphi(\boldsymbol{v},\boldsymbol{\mu}_g,\boldsymbol{\Sigma}_g)\,f(\boldsymbol{w},\boldsymbol{\gamma}_g).$$
**Parameter estimation:** The EM algorithm discussed in the previous section is used to estimate the parameters of this model. Let $(\boldsymbol{x}_1^{\prime},y_1)^{\prime},\dots,(\boldsymbol{x}_n^{\prime},y_n)^{\prime}$ be a sample of $n$ independent observation pairs drawn from the model in Equation (9). For this sample, the complete data likelihood function, $L_c(\boldsymbol{\Phi})$, is given by

$$L_c(\boldsymbol{\Phi})=\prod_{i=1}^{n}\prod_{g=1}^{G}\left[\tau_g\,f(y_i\mid\boldsymbol{x}_i,\boldsymbol{\beta}_g,\lambda_g)\,\varphi(\boldsymbol{v}_i,\boldsymbol{\mu}_g,\boldsymbol{\Sigma}_g)\,f(\boldsymbol{w}_i,\boldsymbol{\gamma}_g)\right]^{z_{ig}}.$$

By taking the logarithm of Equation (10), the complete data log-likelihood function, $\ell_c(\boldsymbol{\Phi})$, is expressed as

$$\ell_c(\boldsymbol{\Phi})=\sum_{i=1}^{n}\sum_{g=1}^{G}z_{ig}\left[\log\left(\tau_g\right)+\log f(y_i\mid\boldsymbol{x}_i,\boldsymbol{\beta}_g,\lambda_g)+\log\varphi(\boldsymbol{v}_i,\boldsymbol{\mu}_g,\boldsymbol{\Sigma}_g)+\log f(\boldsymbol{w}_i,\boldsymbol{\gamma}_g)\right].$$

Its conditional expectation, given the observed data and the current parameter estimates, defines the Q-function

$$Q(\boldsymbol{\Phi};\boldsymbol{\Phi}^{(t-1)})=\sum_{i=1}^{n}\sum_{g=1}^{G}\tau_{ig}^{(t-1)}\left[\log\left(\tau_g\right)+\log f(y_i\mid\boldsymbol{x}_i,\boldsymbol{\beta}_g,\lambda_g)+\log\varphi(\boldsymbol{v}_i,\boldsymbol{\mu}_g,\boldsymbol{\Sigma}_g)+\log f(\boldsymbol{w}_i,\boldsymbol{\gamma}_g)\right].$$
**E-step:** The posterior probability that $(\boldsymbol{x}_i^{\prime},y_i)^{\prime}$ comes from the $g$-th mixture component is calculated at the $t$-th iteration of the EM algorithm as

$$\tau_{ig}^{(t)}=E\left[z_{ig}\mid(\boldsymbol{x}_i^{\prime},y_i)^{\prime},\boldsymbol{\Phi}^{(t)}\right]=\frac{\tau_g^{(t)}\,f(y_i\mid\boldsymbol{x}_i,\boldsymbol{\beta}_g^{(t)},\lambda_g^{(t)})\,\varphi(\boldsymbol{v}_i,\boldsymbol{\mu}_g^{(t)},\boldsymbol{\Sigma}_g^{(t)})\,f(\boldsymbol{w}_i,\boldsymbol{\gamma}_g^{(t)})}{\sum_{g^{\prime}=1}^{G}\tau_{g^{\prime}}^{(t)}\,f(y_i\mid\boldsymbol{x}_i,\boldsymbol{\beta}_{g^{\prime}}^{(t)},\lambda_{g^{\prime}}^{(t)})\,\varphi(\boldsymbol{v}_i,\boldsymbol{\mu}_{g^{\prime}}^{(t)},\boldsymbol{\Sigma}_{g^{\prime}}^{(t)})\,f(\boldsymbol{w}_i,\boldsymbol{\gamma}_{g^{\prime}}^{(t)})}.$$
**M-step:** The Q-function is maximized with respect to $\boldsymbol{\Phi}$, which is done separately for each term on the right-hand side in Equation (9). As a result, the parameter estimates $\widehat{\tau}_g$, $\widehat{\boldsymbol{\mu}}_g$, $\widehat{\boldsymbol{\Sigma}}_g$, and $\widehat{\boldsymbol{\gamma}}_g$ are obtained on the $(t+1)$-th iteration of the EM algorithm:

$$\begin{aligned}\widehat{\tau}_g^{(t+1)}&=\frac{1}{n}\sum_{i=1}^{n}\tau_{ig}^{(t)}\\\widehat{\boldsymbol{\mu}}_g^{(t+1)}&=\frac{1}{\sum_{i=1}^{n}\tau_{ig}^{(t)}}\sum_{i=1}^{n}\tau_{ig}^{(t)}\boldsymbol{v}_i\\\widehat{\boldsymbol{\Sigma}}_g^{(t+1)}&=\frac{1}{\sum_{i=1}^{n}\tau_{ig}^{(t)}}\sum_{i=1}^{n}\tau_{ig}^{(t)}(\boldsymbol{v}_i-\widehat{\boldsymbol{\mu}}_g^{(t+1)})(\boldsymbol{v}_i-\widehat{\boldsymbol{\mu}}_g^{(t+1)})^{\prime}\\\widehat{\gamma}_{grs}^{(t+1)}&=\frac{\sum_{i=1}^{n}\tau_{ig}^{(t)}w_i^{rs}}{\sum_{i=1}^{n}\tau_{ig}^{(t)}}.\end{aligned}$$

The estimates of $\boldsymbol{\beta}_g$ and $\lambda_g$ are obtained by maximizing the weighted log-likelihood

$$\sum_{i=1}^{n}\tau_{ig}^{(t)}\log f(y_i\mid\boldsymbol{x}_i,\boldsymbol{\beta}_g,\lambda_g).$$

This maximization is carried out numerically in the **R** language (R Core Team 2016), in a framework similar to the one in which mixtures of generalized linear models are implemented. For additional details about this implementation, the reader is referred to Wedel and DeSarbo (1995) and Wedel (2002).
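The closed-form updates for $\tau_g$, $\boldsymbol{\mu}_g$, $\boldsymbol{\Sigma}_g$, and $\boldsymbol{\gamma}_g$ can be sketched as weighted averages. The Python sketch below is a simplification and not the implementation used in this paper: it assumes a single scalar continuous covariate (so $\boldsymbol{\Sigma}_g$ reduces to a variance) and a single categorical covariate, and the function name `cwm_closed_form_updates` is hypothetical.

```python
def cwm_closed_form_updates(post, v, w, n_levels):
    """Weighted closed-form M-step updates for one EM iteration.

    post     : post[i][g] = posterior probability tau_ig from the E-step
    v        : v[i] = scalar continuous covariate (simplifying assumption)
    w        : w[i] = level in 1..n_levels of one categorical covariate
    n_levels : number of levels c_r of that covariate
    """
    n, n_groups = len(post), len(post[0])
    tau, mu, sigma2, gamma = [], [], [], []
    for g in range(n_groups):
        weight = sum(post[i][g] for i in range(n))  # sum_i tau_ig
        tau.append(weight / n)
        mean_g = sum(post[i][g] * v[i] for i in range(n)) / weight
        mu.append(mean_g)
        sigma2.append(
            sum(post[i][g] * (v[i] - mean_g) ** 2 for i in range(n)) / weight
        )
        # gamma_gs: weighted share of observations at level s (w^{rs} = 1)
        gamma.append([
            sum(post[i][g] for i in range(n) if w[i] == s) / weight
            for s in range(1, n_levels + 1)
        ])
    return tau, mu, sigma2, gamma

# Toy example with hard (0/1) posterior weights, values made up.
post = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
tau, mu, sigma2, gamma = cwm_closed_form_updates(
    post, v=[1.0, 3.0, 10.0, 14.0], w=[1, 2, 2, 2], n_levels=2
)
```

With hard weights the updates reduce to the within-group proportion, mean, variance, and level frequencies, which makes the role of $\tau_{ig}^{(t)}$ as an observation weight easy to see. The regression parameters $\boldsymbol{\beta}_g$ and $\lambda_g$ are deliberately left out because they require a weighted GLM fit.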

#### 2.3. Model Selection Criterion

## 3. Application

#### 3.1. Data

#### 3.2. OSM Results

#### 3.3. CWM Results

The CWM was fitted using the R package **flexCWM**, developed by Mazza et al. (2017). The log-normal CWM was fitted to the following covariates: driver age, car age, density, and exposure. The model selection procedure based on the AIC and the BIC found three mixture components with their corresponding mixing probabilities as follows: $0.52$, $0.43$, and $0.05$. Table 4 shows the summary results for log-likelihood, AIC, and BIC. The CWM function selects the best model based on the minimum value of BIC. In our analysis, the best model is detected when $G=3$, and these results are shown in bold in Table 4. The number of selected components is consistent with the OSM approach.

## 4. Conclusions

## Appendix A. Model Fitting

**Table A1.** Results of fitting the OSM (1) for the French motor claims by policy data set.

Coefficient | Estimation | S.E. | 95% C.I. |
---|---|---|---|
${\widehat{\mu}}_{2}$ | 0.551 | 0.148 | (0.261, 0.841) |
${\widehat{\mu}}_{3}$ | −0.219 | 0.171 | (−0.554, 0.116) |
${\widehat{\mu}}_{4}$ | 2.533 | 0.224 | (2.094, 2.972) |
${\widehat{\mu}}_{5}$ | −1.702 | 0.160 | (−2.016, −1.388) |
${\widehat{\alpha}}_{1}$ | 1.096 | 0.210 | (0.684, 1.508) |
${\widehat{\alpha}}_{2}$ | 0.044 | 0.125 | (−0.201, 0.289) |
${\widehat{\beta}}_{1}$ | −2.188 | 0.143 | (−2.468, −1.908) |
${\widehat{\beta}}_{2}$ | −2.631 | 0.199 | (−3.021, −2.241) |
${\widehat{\beta}}_{3}$ | −0.002 | 0.190 | (−0.374, 0.370) |
${\widehat{\beta}}_{4}$ | 1.673 | 0.172 | (1.336, 2.010) |
${\widehat{\varphi}}_{2}$ | 3.636 | 0.209 | (3.226, 4.046) |
${\widehat{\varphi}}_{3}$ | 4.855 | 0.193 | (4.477, 5.233) |
${\widehat{\varphi}}_{4}$ | 4.990 | 0.154 | (4.688, 5.292) |

**Table A2.** Results for the CWM model. For each estimated coefficient, the significance of the p-value is marked using the following levels: $\approx 0$ (***), 0.001 (**), 0.01 (*), and 0.05 (.).

Cluster 1 | | | | |
---|---|---|---|---|
Coefficient | Estimation | S.E. | p-Value | |
Intercept | $7.3496$ | $0.2765$ | $<2.2\times 10^{-16}$ | *** |
DriverAge2 | $-0.3275$ | $0.2137$ | $0.1634$ | |
DriverAge3 | $-0.2088$ | $0.1963$ | $0.2877$ | |
DriverAge4 | $-0.0451$ | $0.1925$ | $0.8146$ | |
DriverAge5 | $0.4828$ | $0.2812$ | $0.0863$ | . |
CarAge2 | $-0.1575$ | $0.2119$ | $0.4574$ | |
CarAge3 | $0.0086$ | $0.2001$ | $0.9653$ | |
CarAge4 | $-0.1845$ | $0.2088$ | $0.3770$ | |
CarAge5 | $-0.4929$ | $0.2939$ | $0.0938$ | . |
Density | $0.0004$ | $0.0001$ | $5.008\times 10^{-05}$ | *** |
Exposure | $-0.8287$ | $0.1401$ | $4.332\times 10^{-09}$ | *** |
Cluster 2 | | | | |
Coefficient | Estimation | S.E. | p-Value | |
Intercept | $7.0694$ | $0.0088$ | $<2.2\times 10^{-16}$ | *** |
DriverAge2 | $-0.0244$ | $0.0084$ | $0.0381$ | ** |
DriverAge3 | $-0.0157$ | $0.0066$ | $0.0177$ | * |
DriverAge4 | $-0.0095$ | $0.0074$ | $0.1412$ | |
DriverAge5 | $0.0008$ | $0.0078$ | $0.9186$ | |
CarAge2 | $0.0051$ | $0.7986$ | $0.4246$ | |
CarAge3 | $0.0118$ | $2.0162$ | $0.0439$ | * |
CarAge4 | $0.0101$ | $0.0060$ | $0.0970$ | . |
CarAge5 | $0.0113$ | $0.0077$ | $0.1440$ | |
Density | $3.2818\times 10^{-06}$ | $3.0435\times 10^{-06}$ | $0.2811$ | |
Exposure | $-0.0051$ | $0.0047$ | $0.2815$ | |
Cluster 3 | | | | |
Coefficient | Estimation | S.E. | p-Value | |
Intercept | $3.3979$ | $0.2561$ | $<2.2\times 10^{-16}$ | *** |
DriverAge2 | $1.2945$ | $0.1588$ | $8.84\times 10^{-16}$ | *** |
DriverAge3 | $1.2333$ | $0.1382$ | $<2.2\times 10^{-16}$ | *** |
DriverAge4 | $1.1096$ | $0.1295$ | $<2.2\times 10^{-16}$ | *** |
DriverAge5 | $2.6965$ | $0.2028$ | $<2.2\times 10^{-16}$ | *** |
CarAge2 | $0.6748$ | $0.2105$ | $0.0013$ | ** |
CarAge3 | $1.9939$ | $0.18853$ | $<2.2\times 10^{-16}$ | *** |
CarAge4 | $1.8501$ | $0.19410$ | $<2.2\times 10^{-16}$ | *** |
CarAge5 | $2.7567$ | $0.26130$ | $<2.2\times 10^{-16}$ | *** |
Density | $1.7878\times 10^{-04}$ | $4.3673\times 10^{-05}$ | $4.520\times 10^{-05}$ | *** |
Exposure | $6.2711\times 10^{-02}$ | $0.5266$ | $0.5985$ | |

## Appendix B. Average Scores for Scatter Plots

**Figure 1.** Scatter plot (**left**) depicting the clustering composition for $R=3$ claim clusters. Different color and shape symbols represent the clusters: Cluster 1 (square), Cluster 2 (circle), and Cluster 3 (triangle). The bar plot (**right**) displays the profile of the claims in each cluster. The percentage represents the probability ${\theta}_{gjk}$ in each category (Equation (2)).

**Figure 2.** Spaced mosaic plot for the row clustering model $G=3$. The height of each block is proportional to the number of claims in each claim cluster; the width is proportional to the numbers of each ordinal response within each cluster. The area represents the frequency of each combination, also shown numerically in each block. The relative spacing between ordinal categories (e.g., 2.636 between 0 and 1, shown by the yellow, red, and green bars) has been determined by the data.

**Figure 3.** The bar plot (**left**) displays the profile of the losses in each cluster G = 1:3, by driver age. The bar plot (**right**) displays the profile of the losses in each cluster G = 1:3, by car age.

**Table 1.** Summary of the variables used in the cluster-weighted model (CWM) and the ordered stereotype model (OSM).

CWM | |
---|---|
Variable Name | Description with Categorical Levels in Parentheses |
Driver Age | <23 (1), [23, 27) (2), [27, 43) (3), [43, 75) (4), and 75+ (5) |
Car Age | <1 (1), [1, 5) (2), [5, 10) (3), [10, 15) (4), and 15+ (5) |
Density | continuous |
Exposure | continuous |
Losses | continuous |
OSM | |
Variable Name | Description with Ordinal Levels in Parentheses |
Driver Age | <23 (5), [23, 27) (4), [27, 43) (3), [43, 75) (2), and 75+ (1) |
Car Age | <1 (1), [1, 5) (2), [5, 10) (3), [10, 15) (4), and 15+ (5) |
Exposure | <0.25 (1), [0.25, 0.50) (2), [0.50, 0.75) (3), [0.75, 1.00) (4), and 1.00+ (5) |
Density | <40 (1), [40, 200) (2), [200, 500) (3), [500, 4500) (4), and 4500+ (5) |
Losses | <1000 (1), [1000, 2000) (2), [2000, 50,000) (3), [50,000, 100,000) (4), and 100,000+ (5) |

G | Loglik | AIC | BIC |
---|---|---|---|
1 | −12,155 | 24,453 | 24,599 |
2 | −12,081 | 24,188 | 24,276 |
3 | −11,777 | 23,584 | 23,685 |
4 | −12,773 | 25,580 | 25,695 |
5 | −12,851 | 25,641 | 25,769 |

G | Loss | Driver Age | Exposure | Car Age | Density |
---|---|---|---|---|---|
1 | $3.22$ | $4.63$ | $4.52$ | $4.57$ | $3.57$ |
2 | $1.95$ | $4.81$ | $3.96$ | $4.38$ | $1.88$ |
3 | $1.78$ | $4.90$ | $3.20$ | $3.23$ | $1.94$ |

G | Loglik | AIC | BIC |
---|---|---|---|
1 | −12,495 | 25,025 | 25,112 |
2 | −11,956 | 23,229 | 23,394 |
3 | −11,064 | 22,222 | 22,464 |
4 | −10,801 | 22,200 | 22,519 |
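As a small check of the minimum-BIC selection rule described in Section 3.3, the values of the table directly above can be compared programmatically. The sketch below simply copies the table entries into a Python list; the variable names are chosen for this example, and treating this table as the CWM model-selection summary is an assumption based on its position and on the reported choice of $G=3$.

```python
# (G, log-likelihood, AIC, BIC), copied from the table directly above
rows = [
    (1, -12495, 25025, 25112),
    (2, -11956, 23229, 23394),
    (3, -11064, 22222, 22464),
    (4, -10801, 22200, 22519),
]

# Select the number of components with the smallest criterion value.
best_by_bic = min(rows, key=lambda r: r[3])[0]
best_by_aic = min(rows, key=lambda r: r[2])[0]

print("G selected by BIC:", best_by_bic)
print("G selected by AIC:", best_by_aic)
```

For these values the BIC is smallest at $G=3$, in line with the selected model, whereas the AIC is slightly smaller at $G=4$; this reflects the heavier penalty that the BIC places on additional parameters.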

