Risks

Editorial

Jump to: Research

2 pages, 258 KiB

Open AccessEditorial

Special Issue “Machine Learning in Insurance”

by Vali Asimit, Ioannis Kyriakou and Jens Perch Nielsen

Risks 2020, 8(2), 54; https://doi.org/10.3390/risks8020054 - 25 May 2020

Cited by 6 | Viewed by 3664

Abstract

It is our pleasure to prologue the special issue on “Machine Learning in Insurance”, which represents a compilation of ten high-quality articles discussing avant-garde developments or introducing new theoretical or practical advances in this field [...] Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

Research

Jump to: Editorial

14 pages, 1114 KiB

Open AccessArticle

A Note on Combining Machine Learning with Statistical Modeling for Financial Data Analysis

by José María Sarabia, Faustino Prieto, Vanesa Jordá and Stefan Sperlich

Risks 2020, 8(2), 32; https://doi.org/10.3390/risks8020032 - 3 Apr 2020

Cited by 2 | Viewed by 3633

Abstract

This note revisits the ideas of the so-called semiparametric methods that we consider to be very useful when applying machine learning in insurance. To this aim, we first recall the main essence of semiparametrics like the mixing of global and local estimation and [...] Read more.

This note revisits the ideas of the so-called semiparametric methods that we consider to be very useful when applying machine learning in insurance. To this aim, we first recall the main essence of semiparametrics like the mixing of global and local estimation and the combining of explicit modeling with purely data adaptive inference. Then, we discuss stepwise approaches with different ways of integrating machine learning. Furthermore, for the modeling of prior knowledge, we introduce classes of distribution families for financial data. The proposed procedures are illustrated with data on stock returns for five companies of the Spanish value-weighted index IBEX35. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

27 pages, 2013 KiB

Open AccessEditor’s ChoiceArticle

Prediction of Claims in Export Credit Finance: A Comparison of Four Machine Learning Techniques

by Mathias Bärtl and Simone Krummaker

Risks 2020, 8(1), 22; https://doi.org/10.3390/risks8010022 - 1 Mar 2020

Cited by 21 | Viewed by 11104

Abstract

This study evaluates four machine learning (ML) techniques (Decision Trees (DT), Random Forests (RF), Neural Networks (NN) and Probabilistic Neural Networks (PNN)) on their ability to accurately predict export credit insurance claims. Additionally, we compare the performance of the ML techniques against a [...] Read more.

This study evaluates four machine learning (ML) techniques (Decision Trees (DT), Random Forests (RF), Neural Networks (NN) and Probabilistic Neural Networks (PNN)) on their ability to accurately predict export credit insurance claims. Additionally, we compare the performance of the ML techniques against a simple benchmark (BM) heuristic. The analysis is based on the utilisation of a dataset provided by the Berne Union, which is the most comprehensive collection of export credit insurance data and has been used in only two scientific studies so far. All ML techniques performed relatively well in predicting whether or not claims would be incurred, and, with limitations, in predicting the order of magnitude of the claims. No satisfactory results were achieved predicting actual claim ratios. RF performed significantly better than DT, NN and PNN against all prediction tasks, and most reliably carried their validation performance forward to test performance. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

79 pages, 1797 KiB

Open AccessEditor’s ChoiceArticle

Machine Learning in Least-Squares Monte Carlo Proxy Modeling of Life Insurance Companies

by Anne-Sophie Krah, Zoran Nikolić and Ralf Korn

Risks 2020, 8(1), 21; https://doi.org/10.3390/risks8010021 - 21 Feb 2020

Cited by 12 | Viewed by 7278

Abstract

Under the Solvency II regime, life insurance companies are asked to derive their solvency capital requirements from the full loss distributions over the coming year. Since the industry is currently far from being endowed with sufficient computational capacities to fully simulate these distributions, [...] Read more.

Under the Solvency II regime, life insurance companies are asked to derive their solvency capital requirements from the full loss distributions over the coming year. Since the industry is currently far from being endowed with sufficient computational capacities to fully simulate these distributions, the insurers have to rely on suitable approximation techniques such as the least-squares Monte Carlo (LSMC) method. The key idea of LSMC is to run only a few wisely selected simulations and to process their output further to obtain a risk-dependent proxy function of the loss. In this paper, we present and analyze various adaptive machine learning approaches that can take over the proxy modeling task. The studied approaches range from ordinary and generalized least-squares regression variants over generalized linear model (GLM) and generalized additive model (GAM) methods to multivariate adaptive regression splines (MARS) and kernel regression routines. We justify the combinability of their regression ingredients in a theoretical discourse. Further, we illustrate the approaches in slightly disguised real-world experiments and perform comprehensive out-of-sample tests. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

13 pages, 406 KiB

Open AccessArticle

Modelling Unobserved Heterogeneity in Claim Counts Using Finite Mixture Models

by Lluís Bermúdez, Dimitris Karlis and Isabel Morillo

Risks 2020, 8(1), 10; https://doi.org/10.3390/risks8010010 - 29 Jan 2020

Cited by 9 | Viewed by 4013

Abstract

When modelling insurance claim count data, the actuary often observes overdispersion and an excess of zeros that may be caused by unobserved heterogeneity. A common approach to accounting for overdispersion is to consider models with some overdispersed distribution as opposed to Poisson models. [...] Read more.

When modelling insurance claim count data, the actuary often observes overdispersion and an excess of zeros that may be caused by unobserved heterogeneity. A common approach to accounting for overdispersion is to consider models with some overdispersed distribution as opposed to Poisson models. Zero-inflated, hurdle and compound frequency models are typically applied to insurance data to account for such a feature of the data. However, a natural way to deal with unobserved heterogeneity is to consider mixtures of a simpler models. In this paper, we consider k-finite mixtures of some typical regression models. This approach has interesting features: first, it allows for overdispersion and the zero-inflated model represents a special case, and second, it allows for an elegant interpretation based on the typical clustering application of finite mixture models. k-finite mixture models are applied to a car insurance claim dataset in order to analyse whether the problem of unobserved heterogeneity requires a richer structure for risk classification. Our results show that the data consist of two subpopulations for which the regression structure is different. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

17 pages, 714 KiB

Open AccessArticle

In-Sample Hazard Forecasting Based on Survival Models with Operational Time

by Stephan M. Bischofberger

Risks 2020, 8(1), 3; https://doi.org/10.3390/risks8010003 - 3 Jan 2020

Cited by 4 | Viewed by 3378

Abstract

We introduce a generalization of the one-dimensional accelerated failure time model allowing the covariate effect to be any positive function of the covariate. This function and the baseline hazard rate are estimated nonparametrically via an iterative algorithm. In an application in non-life reserving, [...] Read more.

We introduce a generalization of the one-dimensional accelerated failure time model allowing the covariate effect to be any positive function of the covariate. This function and the baseline hazard rate are estimated nonparametrically via an iterative algorithm. In an application in non-life reserving, the survival time models the settlement delay of a claim and the covariate effect is often called operational time. The accident date of a claim serves as covariate. The estimated hazard rate is a nonparametric continuous-time alternative to chain-ladder development factors in reserving and is used to forecast outstanding liabilities. Hence, we provide an extension of the chain-ladder framework for claim numbers without the assumption of independence between settlement delay and accident date. Our proposed algorithm is an unsupervised learning approach to reserving that detects operational time in the data and adjusts for it in the estimation process. Advantages of the new estimation method are illustrated in a data set consisting of paid claims from a motor insurance business line on which we forecast the number of outstanding claims. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

20 pages, 429 KiB

Open AccessArticle

A Likelihood Approach to Bornhuetter–Ferguson Analysis

by Valandis Elpidorou, Carolin Margraf, María Dolores Martínez-Miranda and Bent Nielsen

Risks 2019, 7(4), 119; https://doi.org/10.3390/risks7040119 - 10 Dec 2019

Cited by 3 | Viewed by 4814

Abstract

A new Bornhuetter–Ferguson method is suggested herein. This is a variant of the traditional chain ladder method. The actuary can adjust the relative ultimates using externally estimated relative ultimates. These correspond to linear constraints on the Poisson likelihood underpinning the chain ladder method. [...] Read more.

A new Bornhuetter–Ferguson method is suggested herein. This is a variant of the traditional chain ladder method. The actuary can adjust the relative ultimates using externally estimated relative ultimates. These correspond to linear constraints on the Poisson likelihood underpinning the chain ladder method. Adjusted cash flow estimates were obtained as constrained maximum likelihood estimates. The statistical derivation of the new method is provided in the generalised linear model framework. A related approach in the literature, combining unconstrained and constrained maximum likelihood estimates, is presented in the same framework and compared theoretically. A data illustration is described using a motor portfolio from a Greek insurer. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

22 pages, 944 KiB

Open AccessArticle

Conditional Variance Forecasts for Long-Term Stock Returns

by Enno Mammen, Jens Perch Nielsen, Michael Scholz and Stefan Sperlich

Risks 2019, 7(4), 113; https://doi.org/10.3390/risks7040113 - 5 Nov 2019

Cited by 11 | Viewed by 4171

Abstract

In this paper, we apply machine learning to forecast the conditional variance of long-term stock returns measured in excess of different benchmarks, considering the short- and long-term interest rate, the earnings-by-price ratio, and the inflation rate. In particular, we apply in a two-step [...] Read more.

In this paper, we apply machine learning to forecast the conditional variance of long-term stock returns measured in excess of different benchmarks, considering the short- and long-term interest rate, the earnings-by-price ratio, and the inflation rate. In particular, we apply in a two-step procedure a fully nonparametric local-linear smoother and choose the set of covariates as well as the smoothing parameters via cross-validation. We find that volatility forecastability is much less important at longer horizons regardless of the chosen model and that the homoscedastic historical average of the squared return prediction errors gives an adequate approximation of the unobserved realised conditional variance for both the one-year and five-year horizon. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

17 pages, 441 KiB

Open AccessArticle

On the Validation of Claims with Excess Zeros in Liability Insurance: A Comparative Study

by Marjan Qazvini

Risks 2019, 7(3), 71; https://doi.org/10.3390/risks7030071 - 30 Jun 2019

Cited by 5 | Viewed by 4471

Abstract

In this study, we consider the problem of zero claims in a liability insurance portfolio and compare the predictability of three models. We use French motor third party liability (MTPL) insurance data, which has been used for a pricing game, and show that [...] Read more.

In this study, we consider the problem of zero claims in a liability insurance portfolio and compare the predictability of three models. We use French motor third party liability (MTPL) insurance data, which has been used for a pricing game, and show that how the type of coverage and policyholders’ willingness to subscribe to insurance pricing, based on telematics data, affects their driving behaviour and hence their claims. Using our validation set, we then predict the number of zero claims. Our results show that although a zero-inflated Poisson (ZIP) model performs better than a Poisson regression, it can even be outperformed by logistic regression. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

16 pages, 2386 KiB

Open AccessArticle

Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression

by Jessica Pesantez-Narvaez, Montserrat Guillen and Manuela Alcañiz

Risks 2019, 7(2), 70; https://doi.org/10.3390/risks7020070 - 20 Jun 2019

Cited by 129 | Viewed by 17617

Abstract

XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims versus no claims can be used to identify the determinants of traffic accidents. This study compared the relative performances of logistic regression [...] Read more.

XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims versus no claims can be used to identify the determinants of traffic accidents. This study compared the relative performances of logistic regression and XGBoost approaches for predicting the existence of accident claims using telematics data. The dataset contained information from an insurance company about the individuals’ driving patterns—including total annual distance driven and percentage of total distance driven in urban areas. Our findings showed that logistic regression is a suitable model given its interpretability and good predictive capacity. XGBoost requires numerous model-tuning procedures to match the predictive performance of the logistic regression model and greater effort as regards to interpretation. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

18 pages, 418 KiB

Open AccessArticle

Sound Deposit Insurance Pricing Using a Machine Learning Approach

by Hirbod Assa, Mostafa Pouralizadeh and Abdolrahim Badamchizadeh

Risks 2019, 7(2), 45; https://doi.org/10.3390/risks7020045 - 19 Apr 2019

Cited by 4 | Viewed by 4355

Abstract

While the main conceptual issue related to deposit insurances is the moral hazard risk, the main technical issue is inaccurate calibration of the implied volatility. This issue can raise the risk of generating an arbitrage. In this paper, first, we discuss that by [...] Read more.

While the main conceptual issue related to deposit insurances is the moral hazard risk, the main technical issue is inaccurate calibration of the implied volatility. This issue can raise the risk of generating an arbitrage. In this paper, first, we discuss that by imposing the no-moral-hazard risk, the removal of arbitrage is equivalent to removing the static arbitrage. Then, we propose a simple quadratic model to parameterize implied volatility and remove the static arbitrage. The process of removing the static risk is as follows: Using a machine learning approach with a regularized cost function, we update the parameters in such a way that butterfly arbitrage is ruled out and also implementing a calibration method, we make some conditions on the parameters of each time slice to rule out calendar spread arbitrage. Therefore, eliminating the effects of both butterfly and calendar spread arbitrage make the implied volatility surface free of static arbitrage. Full article

(This article belongs to the Special Issue Machine Learning in Insurance)

► Show Figures

Figure 1

Journal Menu

Journal Browser

Machine Learning in Insurance

Share This Special Issue

Special Issue Editors

Special Issue Information

Keywords

Benefits of Publishing in a Special Issue

Published Papers (11 papers)

Editorial

Research

Further Information

Guidelines

MDPI Initiatives

Follow MDPI