Big Data and Actuarial Science

This article investigates the impact of big data on the actuarial sector. The growing fields of application for data analytics and data mining raise the ability of insurance companies to conduct more accurate policy pricing by incorporating a broader variety of data, owing to increased data availability. The areas analyzed in this paper span automobile insurance policy pricing, mortality and healthcare modeling, the estimation of harvest, climate, and cyber risk, and the assessment of catastrophe risks such as storms, hurricanes, tornadoes, geomagnetic events, earthquakes, floods, and fires. We evaluate the current use of big data in these contexts and how data analytics and data mining contribute to the prediction capabilities and the accuracy of insurance companies' policy premium pricing. We find a high penetration of big data in policy pricing across almost all actuarial fields, except in the modeling and pricing of cyber security risk, owing to a lack of data in this area and prevailing data asymmetries; here we identify the application of artificial intelligence, in particular machine learning techniques, as a possible way to improve pricing accuracy and results.


Introduction
Actuarial science is a far-reaching discipline in numerous contexts. In 2018, insurance sectors worldwide amassed revenue exceeding $5 trillion [1], a total that exceeds the entire GDP of the world's third-largest economy, Japan [2].
The insurance sector depends upon risk analysis and functions in a domain of uncertainty. As humans, we are equipped with intuitive reasoning to recognize and comprehend risk. Woven into the progress of the Western world, the concept of risk motivates all elements of our daily lives, with safeguarding our own lifespan as the backdrop. Such uncertainty entered the minds of the first city-state inhabitants as they shifted their lifestyles to suit crude forms of financial living within highly connected groups. The adjustment from haggling to systems of exchange meant that risk became collectively measurable in respect of a common currency: a commercial innovation that had a knock-on effect on our ability to judge risk on multiple levels.
Certain qualities of insurance were in place before recognized formalities made up the origins of business and trade, historically identifiable by 400 BC. The merchants of Ancient Greece were able to obtain loans for ocean voyages with the stipulation that, if the shipment became lost at sea, no repayments would be sought. Plus, interest was applied to the loan to account for the insurance risk, which meant that the risk pool, an acknowledgement that a cohort and the recipient are involved in shared risk-taking, was present even during these early origins [3]. Data-led methods to guide underwriting strategies have long been a norm for the insurance sector, as has their utilization for risk analysis, claim management, fraud, and pricing structures (Shapiro and Jain (2003) [14]).
Although insurers have a significant heritage of establishing and leveraging data-led methods, doing so has usually been domain-specific. There is the potential to be more general, however, as more extensive data sources continue to become available, which, in turn, enhances the scope of analytics. Big data breakthroughs, combined with innovations in related sectors such as cybersecurity, blockchain, the Internet of Things (IoT), cloud computing (KPMG (2016) [15]; Corlosquet-Habart and Jansen (2018) [16]; EIOPA (2019) [17]), virtual reality, and quantum computing, are all likely to define the direction and success of future economies, against which key developments in actuarial science will be assessed. In this new outlook, insurance is only one sector looking to make the most of this potential, with the activities of numerous additional players also impacting how the sector will operate. Likely examples include data providers and aggregators, retailers, automotive companies, and telecommunications, with the progress of these industries poised to shape the insurance market that relies upon their custom (BearingPoint (2020) [18]). Alongside countless opportunities, however, all of these sectors will produce new risk management complications. One example already apparent is the trend toward AI-powered real-time claims processing, which has opened the door to new forms of fraud, leading to the development of new safety nets such as real-time detection. The more extensive personalization capabilities enabled by these new techniques, including for price and cover, bring forward numerous pricing dilemmas and questions of socio-economic consequence that, if they become of political concern, might lead to increased regulation regarding how big data and AI algorithms are applied (Deloitte Insights (2019) [19]).
The data that insurance companies utilize at the various levels of the insurance value chain might incorporate both personal and non-personal information. These data might come from internal or external sources and be structured or unstructured [20]. Every insurance firm faces the challenge of assessing the likely contributions of a wide range of factors: its own data, whether historic, current, or potential; external open data sources it may have access to; any purchased data; and the knock-on effects of decisions and upgrades made within current insurance packages (Barthelé (2018) [21]).
Structured data are the most regularly applied in insurance analytics, although they make up only 10-15 percent of currently available data. Largely, they are sourced from the tabular data of spreadsheets or from the databases built upon them [22]. Plus, as the insurance sector has advanced over recent years in obtaining and examining extensive structured data connected to various products and customer bases, it has become clear that far larger unstructured data sources remain unexplored [23]. Examples of such unstructured sources include social media content, web feeds, emails, forums, blogs, text documents, and the plethora of streaming data that devices and sensors now capture. Further fresh sources of data made possible by digitization include: the IoT, such as vehicle telematics; wearables data, gathered especially from physical activities and medical patients; online media data, such as from online purchasing, job searches, web searches, and social media; geocoding data, which we are now often dependent upon for travel; genetics data, coming from gene and chromosome analysis; banking habits, in terms of shopping, income, and wealth; together with a wide range of interrelated data streams and unique data generated within insurance firms themselves (EIOPA, 2019).
The analytical capabilities of big data enable the insurance market to gain a perspective on the wider marketplace and to upgrade offers through a slicker and more in-depth appreciation of risk exposure. When internal and external banks of data are merged with existing expertise, together with real-time monitoring of the most impactful factors, insurers can calculate competitive premium rates and support risk assurance in real time. The fresh data sources now made available by digitalization are proving a watershed for the insurance sector, allowing firms to correspond with clients in a manner that enhances brand loyalty. They can also make crucial business procedures, such as claims processing, run more smoothly and support the development of preventive practices that enhance cost-effectiveness and profit.
Significant trends are becoming clear in terms of how the insurance sector is looking to embrace big data. A 2018 analysis from SNS Telecom & IT (2018) [23] confirms that the utilization of big data continues to accelerate, forecasting $3.6 billion worth of investments by 2021. Furthermore, the same research shows figures highlighting how embracing big data leads to 30 percent better access to insurance packages, 40-70 percent cost savings, and a 60 percent improvement in identifying fraud, so it is proving beneficial to companies and to their investors. Another study [24] estimated that 74 percent of insurance firms (figures taken from a sample covering around 4 percent of worldwide companies) confirmed that the utilization of big data and analytics has resulted in competitive gains for their businesses, while a further study [25] found that more than half of respondents reported that finding the right workforce expertise to take advantage of big data is a challenge, and less than ten percent felt they were currently making the most of the available potential.
Breakthroughs in Big Data Analytics and wearables are resulting in growing interest in the prospects of 'personalized' insurance premiums and products (Topol (2015) [26]; Swiss Re (2016) [27]; Meyers et al. (2017) [28]). The key players championing big data, multinational technology giants, are making steady progress toward establishing a presence in the insurance industry via their extensive customer-experience capabilities [29]. A crucial lead these Big Tech firms have over those operating in the insurance market is convenient access to a plethora of customer data, especially in relation to personal and behavioral tendencies. Using this store of information, they are constructing ecosystems capable of making services bespoke to each client, as well as launching new value-added features to enhance the customer experience even further. These significant new insurance market players are also leading the way in pioneering many related skills, via the likes of artificial intelligence and machine learning, as well as data analytics. Worldwide, close to one-third of customers are welcoming the opportunity to purchase insurance from Big Tech companies, as shown by the 2018 World Insurance Report [30]. Indeed, the same report also shows that Gen Y and tech-savvy customers are strongly focused on choosing new insurance providers within a short-term period, which will include more utilization of Big Tech players. Plus, over 65 percent of surveyed customers confirmed that overall personalization of the customer experience is now their key focus.
Actuaries have a firm foundation in traditional forms of statistics and their usage in the overall assessment of insurance and related financial concerns (SOA (2019) [31]). Despite this, a substantial increase has been identified in the application of predictive analytics, owing to its versatility in addressing a wide range of new and varied challenges [32]. Current research expects significant transformations concerning the analytical methods employed by actuaries and related professionals (Sondergeld and Purushotham (2019) [33]; Guo (2003) [34]; Wedel and Kannan (2016) [35]). Actuaries also possess in-depth expertise in relation to how insurance and financial services workplaces function.
To enable a pathway for actuaries to access and flourish in an environment defined by big data, an extensive set of abilities and knowledge needs to be acquired, including data integration and preparation, the construction of specialized computing environments, data mining, and intelligent algorithms (Deloitte Insights (2019) [36]). Even more crucially, data analysis has to be made applicable to the more general aspects of business strategy, so data researchers need to comprehend the business challenges that need answering and the context in which doing so benefits the company. To achieve this, data scientists require a blend of business, technical, and people skills [37].
Due to its relations with computer science, the launch of big data also stretches to electrical engineering and computer science (EECS), as well as the application of platforms [38,39] for data handling techniques. This breakthrough has created significant requirements for higher education institutions to merge EECS with statistics and actuarial science. In this manner, the business sector can progress alongside engineering innovation in pursuing a vision of applying actuarial science to a new industry landscape. A comparable transformation is underway in financial technology (Fintech), which is also dependent on aspects of actuarial science; Fintech educational programs that cover both the business and engineering sectors are clear examples.
Undoubtedly, requirements exist for providing students with mentors from within the profession they are pursuing. The task for actuaries, therefore, will be to grow these abilities themselves, or to become proficient in handling and applying a wide range of new data science tools in numerous multidisciplinary contexts. The priority is for actuaries to reach an appreciation of the benefits that big data offers via innovative data handling methods, especially with regard to unstructured data.
Because actuarial science and insurance make up a worldwide industry, drawing in double the revenue of the oil industry, the focused utilization of big data is worth analyzing. The expected result of big data breakthroughs and innovative technological advances for insurance companies could be enhanced productivity methods, together with greater commercial profits. Indeed, a great deal of potential exists for insurers and for their customers, but there are also ethical issues that big data analytics will inevitably raise, in terms of the transparency of new services, together with the associated accuracy and auditability. Introducing artificial intelligence and machine learning tools will not be achieved without complications (EIOPA, 2019). Figure 1 exhibits the typical process of a machine learning algorithm. After the definition of a problem, a dataset needs to be generated by gathering the relevant data. Once collected, these data serve as a training set subject to a loss function, which serves as input for the learning algorithm, which in turn operates under a pre-set hypothesis. By optimizing the prediction error, the optimal parameters are saved and fed into the learned model, which then performs the prediction analysis.
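As a concrete illustration of the Figure 1 workflow, the following minimal Python sketch walks through the same loop on hypothetical data: a dataset is gathered, a loss function is defined, a learning algorithm (here plain gradient descent on a linear hypothesis) optimizes the prediction error, and the saved parameters form the learned model used for prediction. All data and parameters are illustrative assumptions, not drawn from any study cited here.

```python
# Minimal sketch of the Figure 1 workflow: gather data, define a loss,
# optimize parameters, then predict with the learned model.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # collected feature data (hypothetical)
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

def loss(w, X, y):
    """Mean squared prediction error -- the loss the algorithm optimizes."""
    return np.mean((X @ w - y) ** 2)

w = np.zeros(3)                          # pre-set hypothesis: a linear model
for _ in range(500):                     # learning algorithm: gradient descent
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad                      # step toward lower prediction error

# the optimal parameters are saved in w; the learned model now predicts
x_new = np.array([0.2, -1.0, 0.3])
print("prediction:", x_new @ w, "final loss:", loss(w, X, y))
```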

Automobile Insurance
Actuaries are being armed with new tools and techniques that blend with their existing working procedures and keep them relevant in a world led by big data, thanks to the ability to identify previously unseen patterns unavailable from the databanks used before. Kunce and Chatterjee (2017) [40] highlighted how K-Nearest Neighbors, K-Means Clustering, and Kernel Regression can be applied to assess rating aspects of Commercial Auto Liability policies from a decade of archived closed-claims data (2003-2012). Noll et al. (2018) [41] offer guidance based on a comparison of a generalized linear model for claims frequency modeling with boosting machines, regression trees, and neural network approaches to French motor third-party liability insurance data. Similarly, Zappa et al. (2019) [42] examine the effectiveness of text mining in grasping latent information applicable for directing insurance costs and current NLP results, in the context of accident reports filed from 2005-2007 in the United States.
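To make the kind of comparison reported by Noll et al. (2018) concrete, the sketch below fits a Poisson GLM and a boosted-tree frequency model and compares them on Poisson deviance. It uses synthetic stand-in data rather than the French motor third-party liability dataset, and scikit-learn estimators as generic stand-ins for the authors' implementations.

```python
# Synthetic stand-in for a claims-frequency comparison:
# a Poisson GLM versus a boosting machine.
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_poisson_deviance

rng = np.random.default_rng(1)
n = 5000
age = rng.integers(18, 80, n).astype(float)      # driver age
veh_age = rng.integers(0, 30, n).astype(float)   # vehicle age
exposure = rng.uniform(0.1, 1.0, n)              # policy years in force
lam = np.exp(-2.5 + 0.015 * np.abs(age - 45)) * exposure  # hypothetical intensity
freq = rng.poisson(lam) / exposure               # observed claims per year

X = np.column_stack([age, veh_age])
X_tr, X_te, f_tr, f_te, e_tr, e_te = train_test_split(
    X, freq, exposure, random_state=0)

glm = PoissonRegressor().fit(X_tr, f_tr, sample_weight=e_tr)   # GLM baseline
gbm = HistGradientBoostingRegressor(loss="poisson").fit(
    X_tr, f_tr, sample_weight=e_tr)                            # boosting machine
for name, model in [("GLM", glm), ("boosting", gbm)]:
    dev = mean_poisson_deviance(f_te, model.predict(X_te), sample_weight=e_te)
    print(f"{name} Poisson deviance: {dev:.4f}")               # lower is better
```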
Innovative service and business methods are consistently becoming available thanks to IoT breakthroughs, many of which insurers are immediately drawn to. This IoT expertise allows insurers to attempt more accurate calculations when assessing risk. As detailed by a 2018 European Insurance and Occupational Pensions Authority (EIOPA) survey [43], almost a quarter of respondents to the survey already utilize IoT capabilities, with around one-third predicted to take such a step by 2021. Figures show that the motor insurance industry especially tends to be swifter in taking up new data potential.

In the past, auto insurers have tended to depend upon indirect indicators of driving performance to set their premiums, such as age, address, and credit history. Moving forward, however, there will be more of a tendency toward usage-based insurance featuring real-time pricing, largely made possible by auto insurers leveraging telematics to draw in a mass of data on driving performance and behavior (CB Insights (2019) [44]). Telematics data, which can be gathered via smartphone apps as well as black boxes fitted in vehicles, enable insurance firms to use predictive analytics to obtain more precise readings on poor driving and law-breaking (Bellina et al. (2018) [45]). Predictive techniques may then be applied to risk assessments so that clients can receive tailored benefits that reward quality driving (Huang and Meng (2019) [46]).
Tselentis et al. (2017) [47] offer an analysis of the most common and well-received methods concerning usage-based motor insurance (UBI). Baecke and Bocca (2017) [48] examine the efficacy of telematics driving behavior data in enhancing insurance risk assessments by applying a distinctive dataset from a European insurance business, showing that incorporating typical telematics variables greatly enhances the ability to analyze a client's particular risk. Consequently, insurers have the capacity to construct bespoke products based on a risk assessment of the client, even using only three months of data to make such a decision. Paefgen et al. (2013 [49], 2014 [50]) were among the first to test machine learning capabilities in the context of UBI pricing. Huang and Meng (2019) [46] went further by applying both typical and new telematics data to assess the risk probability and claim likelihood of vehicles in use; their machine learning methods are strengthened by data-binning procedures, which enhance the precision and interpretability available for establishing vehicle insurance costs. In addition, Husnjak et al. (2015) [51] provide an analysis of the system architecture of current telematics methods found in the market, together with the data structure applied to motor insurance billing procedures; a subsequent real case project launched in Eastern Europe attests to the capabilities of this approach.
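A hedged sketch of the data-binning idea described by Huang and Meng (2019): a telematics behavior variable is discretized into bins, and claim-frequency relativities per bin are computed for use as UBI pricing factors. The variable, bin count, and claim-generating process are all hypothetical.

```python
# Discretize a telematics variable and derive per-bin pricing relativities.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 10_000
harsh_brakes = rng.gamma(2.0, 1.5, n)                  # events per 100 km (hypothetical)
claims = rng.poisson(0.05 * (1 + 0.3 * harsh_brakes))  # assumed link to claims

df = pd.DataFrame({"harsh_brakes": harsh_brakes, "claims": claims})
df["bin"] = pd.qcut(df["harsh_brakes"], q=5, labels=False)  # quintile bins

freq = df.groupby("bin")["claims"].mean()              # claim frequency per bin
relativity = freq / freq.mean()                        # factor vs. book average
print(relativity)  # multiply the base premium by the insured's bin factor
```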
Advances in autonomous vehicle technology may bring the most transformative period for automobile insurance since the industry was founded (KPMG, 2015). This shift will certainly transform the way we operate vehicles and travel, but there will also be indirect effects on insurance markets, due to fresh perspectives on the risks involved and inevitable pressures to lower premiums as vehicles become safer (Richman (2018) [52]). KPMG's (2017) [53] research explores the effects autonomous technology is likely to have on vehicle insurance, especially in the context of causing disorder in one of the key roles that actuaries play.
Enhancing the impact of advertising drives and pursuing and eradicating fraud offer further examples of the advantages big data can provide.
Alshamsi (2014) [54] applied random forest algorithms to help insurers anticipate client decisions and design more enticing insurance packages. Wang (2020) [55] examined data from over 60,000 auto insurance packages, employing a LightGBM algorithm to identify the vital features that affect decisions to remain with a particular insurer. Doing so allows businesses to produce more robust advertising tactics.
Fraud within the automobile insurance sector makes up a critical portion of the costs faced by property insurers, leading to pricing consequences and socio-economic impacts. Wang and Xu (2018) [56] suggest a deep learning method for identifying fraud within the insurance market via Latent Dirichlet Allocation (LDA)-based text analytics. This approach utilizes LDA to extract text features from the accident descriptions appearing in claims, before deep neural networks are trained on the results, which combine the text features with the conventional numeric attributes used to recognize fraud. Subudhi and Panigrahi (2018) [57] examined the identification of fraud within the auto insurance market via a data balancing approach called the Adaptive Synthetic Sampling Approach for Imbalanced Learning (ADASYN), combined with three supervised classifiers: Support Vector Machine (SVM), Decision Tree (DT), and Multi-Layered Perceptron (MLP).
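The balancing-plus-classification pipeline of Subudhi and Panigrahi (2018) can be sketched as below, assuming the imbalanced-learn package for ADASYN and scikit-learn for the SVM classifier; the claim features and fraud rate are synthetic illustrations, not the study's data.

```python
# Rebalance the rare fraud class with ADASYN, then train a supervised classifier.
import numpy as np
from imblearn.over_sampling import ADASYN
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(3)
n = 3000
X = rng.normal(size=(n, 6))                   # claim features (hypothetical)
y = (rng.random(n) < 0.03).astype(int)        # ~3% fraudulent claims
X[y == 1] += 1.0                              # fraud cases shifted slightly

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_bal, y_bal = ADASYN(random_state=0).fit_resample(X_tr, y_tr)  # rebalance
clf = SVC(kernel="rbf").fit(X_bal, y_bal)                       # SVM classifier
print(classification_report(y_te, clf.predict(X_te), digits=3))
```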

Mortality Modelling
Current evidence-based underwriting should take on board a significant scale of vital and changeable health and longevity factors, including demographic aspects, lifestyle trends, healthcare innovations, and their interconnectivity. Most public health interventions are undertaken with a focus on the overall well-being of the populace. Actuarial and medical research, on the other hand, usually pursue different priorities: although mortality is of primary interest to an actuary, the progression of medical conditions is the focus of medical researchers. Furthermore, clinical experiments handle patients fitting specific criteria, and typically over a short time period. This makes clear that current medical research, spanning a wide range of systematic studies, while of interest, is not by itself enough for actuarial purposes, meaning that the close participation of actuarial experts in health-data modeling is necessary. Insightful actuarial longevity studies need to focus on statistical modeling for population-based individual-level data obtained over an extensive time period (IFoA (2016) [58]).
Mortality modelling is integral to actuarial science, being applied to the management of longevity risk via the actuarial pricing of mortality-based securities and other products linked to lifespan (LLMA (2010) [59]; Silverman and Simpson (2011) [60]).
Typically, regression-focused approaches have been applied to assess mortality, together with extrapolative fittings (Booth and Tickle (2008) [61]). Pioneering methods, including machine learning, predictive analytics, and deep learning techniques, are being consistently embraced throughout actuarial science. Deprez et al. (2017) [62] apply regression trees to demonstrate ways that mortality modelling can be enhanced by incorporating individual features, as well as to calculate conditional likelihoods that can be linked to mortality; the findings are taken from cases of Swiss mortality, via the Human Mortality Database. Kopinsky (2017) [63] applies tree models to summarize and anticipate maternity recovery and mortality statistics, drawing on somewhere between 500,000 and 3,000,000 cases obtained via a Group Long-Term Disability Database. Hainaut (2018) [64] puts forward the application of a neural network to anticipate and simulate rates of human mortality; this neural analyzer was calibrated to mortality rates from the US, the UK, and France over 1946-2000. The research ascertained that this neural network method offers significant predictive potential, much greater than the frequently used Lee-Carter model (1992). Additionally, Richman (2018) [52] provides an analysis of deep learning capabilities in respect of actuarial issues in mortality predictions. Shang (2017) [65] anticipates mortality statistics according to types of cancer in a particular populace, incorporating predictive methods that include regression, classification, K-nearest neighbors, random forest, regression trees, and neural networks.
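For reference, the Lee-Carter benchmark mentioned above has a compact closed form, log m(x,t) = a_x + b_x k_t, conventionally fitted by a singular value decomposition of the centered log-mortality surface. The sketch below applies it to a synthetic mortality surface standing in for Human Mortality Database rates.

```python
# Lee-Carter (1992) fit via SVD: log m(x,t) = a_x + b_x * k_t.
import numpy as np

rng = np.random.default_rng(4)
ages, years = 10, 40
true_k = np.linspace(2.0, -2.0, years)                    # improving mortality
true_b = np.linspace(0.05, 0.15, ages)
log_m = (-6 + 0.09 * np.arange(ages))[:, None] + np.outer(true_b, true_k)
log_m += rng.normal(scale=0.02, size=(ages, years))       # synthetic surface

a = log_m.mean(axis=1)                                    # a_x: average log rate
U, s, Vt = np.linalg.svd(log_m - a[:, None], full_matrices=False)
b = U[:, 0] / U[:, 0].sum()                               # b_x, normalized to sum 1
k = s[0] * Vt[0] * U[:, 0].sum()                          # k_t: period index

fitted = a[:, None] + np.outer(b, k)                      # rank-1 reconstruction
print("fit RMSE:", np.sqrt(np.mean((fitted - log_m) ** 2)))
```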

Healthcare
Healthcare sectors are experiencing wide-scale disruption due to the pressures of sustainability resulting from ageing populations and spiraling costs. As healthcare structures come under ever more pressure, enhancing the effectiveness of existing practices will become essential if our societies are to meet current expectations of healthcare. In addressing this significant challenge, big data will play a vital role, being well placed to inform innovative research (undoubtedly underused in the current climate), to bring about efficiency improvements, and to support the industry in being more cost-effective overall. Mehta et al. (2019) [66] offer a systematic mapping approach, allowing for the classification and examination of big data analytics and artificial intelligence innovations that may allow healthcare to advance; their study includes an analysis of more than 2000 articles from 2013-2019. One means by which big data can drive healthcare advances is data-driven research applied to enhance personalization techniques, together with their relevance to patient-centered results and the knock-on benefits for re-admission statistics (Chawla and Davis (2013)) [67].
Actuaries operating within healthcare sectors are able to connect with many big data sources that might offer further, previously unobtainable expertise on risks and potential. These new databanks, alongside claims and demographic data, incorporate data obtained from social media, from personal devices such as fitness apps, and from medical equipment like diagnostic technologies. Sources producing such data can include insurance customers, healthcare patients and services, together with various forms of sensor and digitized equipment (Raghupathi and Raghupathi (2014)) [68]. These include examples that were not previously available, such as the ability to map a patient's genome.
Above all, personal devices allow those who maintain them to gather each user's activity data, much of which is applicable to health. Such technology means that extensive lifestyle data are now available, with physical activity and heart rates among the factors that can be read. This will undoubtedly result in more effective predictive and diagnostic analytics (OECD (2020) [69]).
The most significant healthcare benefits from the obtainment and application of big data involve the ability to examine the individual attributes of clients alongside the price and effectiveness of their subsequent care needs, leading to the best available treatments and their costs. Furthermore, pre-emptive approaches can be combined with predictive methods, backed up by extensive input variables. A further potential that big data offers providers is the sharing of key information across numerous institutions, which will enhance their reaction times and understanding of a particular patient's needs. Raghupathi and Raghupathi (2014) discuss numerous other advantages of big data in the context of public health, medical breakthroughs, genomic analytics, the remote monitoring of devices, individual profile analytics, and pre-adjudication fraud assessment. This is furthered by Belle et al. (2015) [69], who examine three key study areas that have already embraced big data. The first of these is image processing, as images have become crucial for informing diagnosis and for agreeing on therapeutic approaches. The second is signal processing, which uses medical signals to assess barriers that are volume- and velocity-based, particularly throughout ongoing, high-resolution acquisition and storage via monitors linked with the patient. The third is genomics, which is benefitting from high-throughput sequencing methods thanks to the reduced costs involved in sequencing the human genome (Lander et al.). However, certain cautions apply before treating big data as being without issue. Feldman (2012) [73], for example, reports on cases of providers handling big data as if it were perfect and without error, yet healthcare is actually the sector most likely to encounter statistics that are inconsistent, unstructured, or possibly even false. There are also numerous service aspects that data do not yet incorporate, such as handwritten prescriptions.
Big data is not the be-all and end-all, therefore, as was shown by the notable Google Flu Trends (GFT) case, in which the search provider attempted to apply search trend activity to estimate influenza outbreaks. This attempt has since been shown to be littered with errors and falsehoods (Kandula and Shaman (2019) [74]).
The utilization of big data still comes with a great deal of complication as mathematical interpretations develop to keep up with what can be learned from databanks. Ascertaining what is likely to occur within biological systems is not yet possible with anything close to total accuracy; addressing this task requires multi-layered research, for example in medical imaging and sensing technologies, so that professionals can use data for anatomical and physiological direction. Typically, health actuaries have applied standard claims data for decisions relating to premiums and related costs. Currently, however, the application of advanced visualization methods is accelerating, incorporating machine learning approaches so that extensive databanks can be interpreted more effectively. One study [92] assessed the predictive precision of extreme gradient boosting by comparing it with logistic regression in an assessment of claims that the Health Care Cost Institute found to be associated with particularly high costs. Boodhun and Jayabalan (2018) [93] applied machine learning algorithms, including random tree, linear regression, and neural network models, to anticipate the risks attached to each potential client, using a Prudential Life Insurance databank containing almost 60,000 applications, with clients characterized by 128 different features.
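The gradient boosting versus logistic regression comparison described above can be sketched as follows, with synthetic claimant features and scikit-learn's histogram gradient boosting standing in for the XGBoost implementation used in the cited study.

```python
# Boosting vs. logistic regression for flagging high-cost claimants.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
n = 8000
X = rng.normal(size=(n, 10))                        # claims history, demographics, ...
risk = X[:, 0] + 0.5 * X[:, 1] * X[:, 2]            # hypothetical nonlinear cost driver
y = (risk + rng.normal(size=n) > 1.5).astype(int)   # high-cost indicator

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
gb = HistGradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("logistic AUC:", roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1]))
print("boosting AUC:", roc_auc_score(y_te, gb.predict_proba(X_te)[:, 1]))
```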
Establishing bespoke medical packages for a multitude of patients demands extensive data that require standardization, as well as being analyzed for precision and bias, safely stored, speedily processed, and made engaging so that professionals can be clear on the advice and direction they offer. Nevertheless, there is significant interest around the world in personalized forms of healthcare. Findings released by the International Actuarial Association (2017) [94] explore the potential of personalized healthcare, the new landscape for genomics and the present obstacles to be overcome, together with the effects these breakthroughs will have on the insurance sector and advice for actuaries. Meanwhile, Hulsen et al. (2018) [95] examine some of the notable pitfalls in incorporating big data to achieve precision healthcare.
Both big data and AI offer extensive prospects for the prediction of health conditions and, subsequently, will support providers in developing bespoke insurance solutions and in calculating more precise pension obligations. Cirillo and Valencia (2019) [96] claim that pioneering machine learning tools, including deep learning and cognitive computing methods, are the direction that data-driven analysis of biomedical databanks will take.
Big Tech firms are researching further innovations to support the diagnosis of health issues at their earliest occurrence. A current study by McKinney et al. (2020) [97] established that the Google Health AI system, the combined effort of Google's DeepMind and a number of medical units, is more effective at diagnosing breast cancer than medical professionals. Additionally, research from Perez et al. (2019) [98] examined the smartphone's ability to detect conditions such as atrial fibrillation, drawing on a databank of 400,000 Apple iPhone users. Nevertheless, key issues remain if the healthcare system is to really make the most of big data, such as the obtainment and standardization of heterogeneous datasets, the identification of applicable clean data, issues relating to whether providers have the consent of users to interpret their data, and the connectivity that exists between healthcare services so that all professionals can benefit from the same level of detail.
Bespoke healthcare and insurance might lead to numerous moral and ethical dilemmas, as well as extensive privacy and security issues. The privacy of consumers needs to be weighed against the positives that obtaining and assessing a massive amount of data would provide. This complicated landscape will be further affected by regulation, while technologies might still progress in unexpected ways and lead to additional issues; actuaries will nevertheless need to assess how to cope with the risks of personalized medicine in a more general manner that suits their clients.

Harvest Risk
Harvest risk and its effects on the food chain make it vital to assess for populations around the world. The insurance sector that serves related clients has been robust in looking for more effective means of understanding the factors that put crops at risk. This is a historic challenge that the first ever human settlements would have been just as concerned with as we are today. While more sophisticated economies led to the division of labor, the establishment of currencies to enable the exchange of goods allowed for substitutes to be put in place when crops failed. Insurance contracts were one result of this dynamic, allowing farmers to seek compensation for poor seasons.
The most significant challenges insurers have to overcome to protect against harvest failures include weather, insects, and issues related to soil quality, presenting a variety of short- and long-term factors. For a more thorough understanding, the obtainment of data is essential for directing insurers' positions. Kamilaris et al. (2018) [99] provide a summary of the big data factors being used to inform agricultural insurers. They note that the rise of smart farming is a significant breakthrough enabled by big data, contributing to enhanced production, responses to environmental complications, sustainability, and the overall food supply. Smart farming answers essential human food needs by strengthening the resource base and environmental value (Pretty (2008) [100]).
Agricultural big data has led to a focus on greater infrastructure so that data can be obtained and accessed. The rewards of embracing big data are significant as far as agriculture is concerned. Through the use of innovative data gathering techniques, professionals and their governments are able to respond to issues better, adjusting their current operations to suit and keeping in mind predictive analytics that might direct their priorities for the coming seasons. Shirsath et al. (2019) [110] offer a fresh approach to creating insurance contracts for weather-related issues by merging crop growth modeling and optimization methods with agro-meteorological statistical assessment. Their proposed heuristic model is able to produce enhanced contracts that result in more accurate and regular payouts without the need for subsidies.
Inevitably, some downsides exist concerning big data usage, including sociopolitical factors that give certain businesses unfair advantages, alongside ethical data collection issues, privacy concerns, and the accessibility of key data (Sykuta (2016)) [111]. Overall security and organization are also an issue (Kshetri (2014) [112]), together with the current absence of any regulatory structure in a big data context (Nandyala and Kim (2016)) [102]. All these factors play an important role when it comes to crop prediction, an important step toward increasing the accuracy of crop insurance policy pricing. Figure 2 presents the flowchart for assessing the expected crop yield. The chart shows how machine learning models can improve crop prediction, and therefore insurance policy pricing, by collecting relevant data, training a model on historic records, and predicting the amount of crop to be harvested under given input conditions.
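As a toy version of the Figure 2 flow, the sketch below trains a model on hypothetical weather and soil features against historic yields and predicts the expected harvest; every feature and coefficient is an illustrative assumption.

```python
# Train on historic harvests, then predict expected yield for new conditions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n = 2000
rain = rng.uniform(200, 900, n)            # growing-season rainfall (mm)
temp = rng.uniform(12, 30, n)              # mean temperature (C)
soil = rng.uniform(0, 1, n)                # soil quality index
yield_t = 2 + 0.004 * rain - 0.05 * (temp - 21) ** 2 + 1.5 * soil
yield_t += rng.normal(scale=0.3, size=n)   # tonnes/hectare, hypothetical process

X = np.column_stack([rain, temp, soil])
X_tr, X_te, y_tr, y_te = train_test_split(X, yield_t, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

expected_yield = model.predict(X_te)       # feeds the indemnity calculation
print("R^2 on held-out seasons:", model.score(X_te, y_te))
```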

Catastrophe Risk
Big data is assuming ever greater importance in simulating and evaluating catastrophe risk, and multiple ways to harness its power exist, or are under development. Catastrophe risk encompasses several categories of major natural disasters, including storms, hurricanes, tornadoes, fires, geomagnetic events, earthquakes, floods, space events, and droughts. Financial data were used to investigate the modalities of pricing catastrophe risk: a 2008 study of 250 catastrophe bonds issued on the capital markets found that the market evaluation of risk worked out at 2.69 times the expected long-term loss (Lane and Mahul (2008)) [114]. Big data also has a role in actuarial practice in rating insurance company risk through catastrophe modeling (Li and Powers (2007)) [115]. Bougen's (2003) [116] investigation into financing mechanisms for catastrophic event-related loss found that, while it is increasingly difficult to find insurance for catastrophic events in today's so-called 'risk society', leveraging the flexibility and adaptability of capitalism may illuminate ways in which our risk society itself may develop in the future.
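Read as a pricing rule of thumb, the Lane and Mahul (2008) finding implies an indicative spread of roughly 2.69 times a bond's modeled expected loss, as the back-of-envelope sketch below shows (the 2% expected loss is an invented example).

```python
# Back-of-envelope catastrophe bond spread from the average market multiple.
expected_loss = 0.02          # modeled annual expected loss, 2% of notional (assumed)
market_multiple = 2.69        # average price / expected loss reported in [114]
indicative_spread = market_multiple * expected_loss
print(f"indicative annual spread: {indicative_spread:.2%} of notional")
```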
The vast reservoir of information offered by big data is of particular value to insurance firms because it enhances their capacity to carry out more detailed and accurate simulations, thus ensuring better cost estimations of expected damage. Big data-driven risk and vulnerability evaluations also enable insurance companies to more precisely tailor premiums by taking account of any risk management practices undertaken by the insured.
Big data can also be gathered from social media and meteorological sites to predict the needs of the general public or particular groups, as shown by Nguyen et al. (2019) [121]. As digitalization proceeds apace, an increasing number of areas open up to the application of big data: not only insurance policy pricing but also supply chain optimization, and crisis response and prevention. As computational capacity also expands, new possibilities open up for the prediction, prevention, and assessment of catastrophe-related damage and its impact on society. For example, Hangan et al. (2016) [122] underline the importance of big data in simulating multiscale geophysical processes, such as wind phenomena, and generating accurate forecasts which will forewarn of effects on both natural and built environments.
Hurricanes
Cox et al. (2018) [123] used big data to predict hurricane trajectories, and such data are also valuable in showing how damage due to high winds and flooding can be investigated. An examination of post-hurricane taxi and subway system resilience was conducted by Zhu et al. (2016) [124], using big data gathered from taxi firms and turnstile records across New York City after Hurricanes Sandy and Irene. The authors found differing recovery rates, with greater resilience shown after Irene than after Sandy, from which they inferred that road networks recover more swiftly than subways.
Other studies in which big data in relation to hurricanes was leveraged for actuarial application include Camara et al.'s (2018) [125] prediction of how such events can affect the stock market and Aldangady et al.'s (2017) [126] investigation of their effect on consumer spending, both of which offer valuable insights for insurance companies.

Tornadoes
Big data yielded by tornadoes has been used in a variety of recent studies. Chen et al. (2019) [127] generated 3D maps of tornado-struck areas by using unmanned aerial vehicle (UAV) images to record damage at roof level and evaluate the structural integrity of affected buildings; the authors found this technology to be an efficient and low-cost source of data for deep learning. Investigating the growth in the power of tornadoes, Elsner et al. (2018) [128] applied a statistical model to big data captured from tornado events occurring in the period 1994-2017 to evaluate the magnitude of the increase. Controlling for variables including the diurnal cycle, seasonality, and climate variability, and using a new metric to estimate damage, they found a yearly increase of 5.5% in power. Lian et al. (2017) [129] gathered and analyzed data scattered across multiple sources, using data mining strategies to create comprehensive meteorological datasets from which they were able to map the spatio-temporal patterns of tornadoes.

Geomagnetic Events
Whereas the effects of some catastrophic events are only felt locally, those of geomagnetic events can be widespread. For example, overvoltage on the power grid can have a supra-regional, even continent-wide, impact. Moreover, if GPS systems go down, industries which are not directly related to the network, such as road-and seaborne freight, can also be heavily impacted, while long-term damage can also be sustained by basic infrastructural elements, such as pipelines being corroded. Hence, big data is invaluable in relation to forecasting and evaluating geomagnetic events, as it can predict long-term infrastructural damage as well as provide real-time information to managers of power plants who must decide whether to close facilities due to the risk of potential damage from overvoltage, frequently under pressure of time.
Big data has been used in a wide range of studies related to geomagnetic events. Srebov et al. (2020) [130] attempted to both explain and predict the dynamics of geomagnetic events through big data related to solar-terrestrial physics, while Pashova et al. (2019) [131] used big datasets gathered across multiple disciplines in an examination of strong geomagnetic storms. Several studies have also leveraged big data to investigate how geomagnetic events affect GPS and other applications (see, for example, Blagoveshchensky and Sergeeva (2019) [132]; Astafyeva et al. (2014) [133]; Gvishiani et al. (2016) [134]; Belehaki (2015) [135]).
An example of the use of big data in relation to geomagnetic events for financial applications is a recent study by Lemmerer and Unger (2019) [136], in which solar storm data were used to set prices for insurance on potential contingent claims for damage caused by solar eruptions. Figure 3 shows the structure of two solar flare forecasting models, using a convolutional and a traditional neural network for learning. Using satellite data from solar observation enables the networks' forecasting performance to be evaluated more quickly. Moreover, the collection and utilization of big data enables the various layers to reduce their forecasting error.
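A minimal sketch of the convolutional variant in Figure 3 might look as follows, assuming PyTorch; the input size, layer sizes, and magnetogram-like input are illustrative assumptions rather than the cited models' actual architecture.

```python
# A tiny CNN mapping a magnetogram-like image to a flare/no-flare probability.
import torch
import torch.nn as nn

class FlareCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(          # convolutional layers
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(              # dense classification head
            nn.Flatten(), nn.Linear(16 * 16 * 16, 32), nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, x):
        return torch.sigmoid(self.head(self.features(x)))  # flare probability

model = FlareCNN()
fake_magnetogram = torch.randn(4, 1, 64, 64)    # stand-in satellite images
print(model(fake_magnetogram).shape)            # -> torch.Size([4, 1])
```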

Earthquakes
Liu et al. (2020) [137] introduce a high-performance blockchain system based on a space-structured ledger and corresponding algorithms, which enables blockchain to operate in the Internet of Things (IoT). This blockchain-based communication supports pre-detection mechanisms using IoT and can therefore be installed in order to alert the population in real time when an earthquake emerges.
Clearly, insurance companies are obliged to work within certain mathematical and geophysical parameters (Kagan (2007)) [138]. Brillinger (1993) [139] lists the principal data necessary to calculate financial insurance premiums against earthquake damage, namely the temporal distribution of major earthquakes; ground motion in earthquake-hit locations; the attenuation of energy with distance; descriptions of damages; and real-life practice by country. The distribution of earthquakes (like other catastrophe risks) is long-tailed; hence the value of big data in determining insurance ruin potential across loss distributions. Mouyiannou and Styles (2017) [140] present seismic catastrophe models which are of use to insurers and re-insurers in estimating vulnerability. Tiampo et al. (2019) [141] used big data to gain a greater understanding of seismic hazards by analyzing ground motion simulations. Their findings enable a better estimation of stability and damage, as well as clarification of construction protocols for earthquake-resistant structures, and are thus of value in estimating the potential damage due to catastrophic events, as well as in mitigating related risks and losses. Figure 4 outlines the flow chart of the process of predicting human losses due to an earthquake. Big data plays a crucial role in seismic data collection and in assessing the importance of the associated features. Machine learning techniques help to learn from collected data which combinations of features may precede an earthquake, so that countermeasures can be taken ahead of time, potentially reducing human losses.
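In the spirit of Figure 4, the sketch below fits a model to synthetic event records and inspects which features drive predicted losses; the features, loss process, and importances are all hypothetical.

```python
# Learn from past events which features drive losses, then inspect importances.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 1500
mag = rng.uniform(4, 8, n)                   # moment magnitude
depth = rng.uniform(5, 70, n)                # focal depth (km)
pop = rng.uniform(10, 5000, n)               # population density
codes = rng.uniform(0, 1, n)                 # building-code quality index
losses = np.exp(1.2 * mag) * pop / 1e4 * (1 - 0.7 * codes) / np.sqrt(depth)
losses *= rng.lognormal(sigma=0.3, size=n)   # hypothetical loss process

X = np.column_stack([mag, depth, pop, codes])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, losses)
for name, imp in zip(["magnitude", "depth", "pop_density", "codes"],
                     model.feature_importances_):
    print(f"{name}: {imp:.2f}")              # importance of associated features
```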

Floods
When evaluating the probability of floods, and thus to be able to predict costs, it is important to collect both satellite image data on geo-spatial structures, that is, the earth and the waters on it, and data on urban developments. Public and private insurance both have a significant role to play in flood risk management, as shown by a study using examples from France and the UK (Crichton (2007)) [142]. Big data are also useful in evaluating the cost of damage caused to residential and commercial structures by coastal flooding (e.g., Andre et al. (2013)) [143].
It is also important for insurance companies to estimate and assess damages directly or indirectly caused by flooding to public health, biogeochemical cycling, and wetland ecology. In this regard, Schumann (2014) [144] investigates the use of big data gathered by airborne lidar to create digital elevation models (DEMs). Findings indicate that the more detailed data generated by such models for use in the modeling and forecasting of regional flooding give them an advantage over satellite images, which lack the necessary detail to allow flood simulation.
Moreover, an analysis of historic flood-related insurance claims can be carried out to inform premium pricing levels. One study aiming to identify the drivers of flood-related claims (impact of flood zones, building characteristics, individual vs. collective mitigation, properties subject to repeat damage, etc.) analyzed both claims made between 1978 and 2012 in the US and quantitative relationships among the factors determining payouts.
Big data can also illuminate the extent and characteristics of the demand for flood insurance. Browne et al. (2016) [145], for example, examined whether such demand was related to behavioral bias. Their findings indicated that insurance for high probability/low consequence risks was preferred to that for low probability/high consequence risks, which may explain why demand for flood coverage remains low.
The basis for an insurance company to price a flood insurance policy is a flood hazard study of the affected region. Figure 5 outlines how flood hazard studies are generated.
Step 1 is already crucial for the accuracy of policy pricing. Since acquiring and evaluating the existing GIS dataset is necessary to gain an overview of the imagery, topography, soils, and census data, big data and the methodologies applied in the above-mentioned literature can contribute significantly to improving the amount and quality of collected data.

Fires
The type of fire is generally topographically determined (wildfire, bush fire, crown fire, etc.), and fires can be deliberate or accidental. Big data is valuable in predicting and modeling both the multiple possible causes (e.g., droughts, electric sparks, etc.) and the financial costs and damage fires cause. Tang (2018) [146] investigated the use of big data in predicting the probability and size of potential fires based on meteorological data, which is of clear use to insurers. Hence, fire prevention measures should be evaluated when calculating policy premiums; however, according to an exploratory study by Byungkwan (2018) [147] which leveraged prevention-related big data, there is no clear positive relationship between fire safety education and the prevention of fires. Gao and Wang (2016) [148] applied machine learning (ML) to ecological and economic data to assess the risk of fire in Yunnan Forest, with the aim of drawing up an integrated risk governance program. The authors' principal findings were that, firstly, forest fire susceptibility can effectively improve the simulated events allocation process; and secondly, the variables of unit area, wildfire susceptibility, and gross value must all be taken into account in assessing the spatial distribution of forest fire risk; hence, risk response planning must be differentiated along these dimensions. The authors' findings were utilized in the forest fire insurance program applied to Yunnan Forest.
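A minimal sketch of a meteorological fire-occurrence model of the kind Tang (2018) investigates might look as follows; the covariates, coefficients, and weather-to-fire link are illustrative assumptions.

```python
# Logistic regression of fire occurrence on weather covariates.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n = 4000
temp = rng.uniform(5, 40, n)                 # daily max temperature (C)
humidity = rng.uniform(10, 90, n)            # relative humidity (%)
wind = rng.uniform(0, 20, n)                 # wind speed (m/s)
logit = -6 + 0.12 * temp - 0.04 * humidity + 0.10 * wind   # hypothetical link
fire = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([temp, humidity, wind])
model = LogisticRegression(max_iter=1000).fit(X, fire)
hot_dry_windy = [[38, 15, 15]]               # one day's forecast conditions
print("fire probability:", model.predict_proba(hot_dry_windy)[0, 1])
```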

Climate Risk
The umbrella category of climate risk is considered to include all the individual catastrophe events considered above, as the climate system itself may be seen as a combination of all such risks. Although climate risk events occur due to a wide range of causes, strong relationships exist between certain types of catastrophe risks; for example, one may be more likely to occur in conjunction with, or be caused by, another: severe air contamination may follow a wildfire, which may itself have been caused by an earthquake. Producing the comprehensive picture of climate risk that modeling requires is only possible through the systematic analysis, including cascade and feedback effects, that big data enables.
To carry out a systematic, big data-driven assessment of climate risk, researchers must combine analysis of historic data streams with information yielded by real-time monitoring. Instruments which can give researchers a useful generalized overview include the annual Global Climate Risk Index (Eckstein et al. (2020)) [149]. The 2020 edition offers an analysis of the degree to which both countries and regions have been impacted by damage related to meteorological events including storms, floods, and heatwaves.
There has been discussion of how the adaptation of insurance instruments can play a role in meeting the challenges of climate change, for example in responding to the increasing risk of flooding (Botzen and van den Bergh (2008)) [150]. Given the pace of climate change, the creation of new financial instruments will also be necessary if the insurance industry is to meet the growth in risks related to environmental damage. Climate risk insurance, like insurance against natural events in general, is being progressively financialized, for example through the issue of catastrophe bonds, which Keucheyan (2018) [151] describes as a neoliberal response to the increasing natural catastrophe-related costs borne by the insurance sector since the 1990s.
Socially responsible insurance underwriting can also make use of big data in assessing climate change risk, in terms of ensuring this type of risk is shared fairly. With this in mind, Zvezdov and Rath (2016) [152] caution that, as part of the evolution of the reinsurance market, regulations must be drawn up to ensure fairness in risk transfer as part of the wider regulatory environment of the financial markets.
Among the big data applications of increasing relevance to the insurance sector are risk assessments of disruption to supply chains, for example the agri-food supply chain, due to climate change. All stakeholders in agriculture, including actors in supply chains, are having to adapt to climate change; insurance firms must therefore update their forecasts of potential loss. Iuliia and Aleksan (2016) [153] summarize the technologies which can be utilized to apply big data to analyses of agribusiness value chains in developing countries, as well as technologies under development or new to the market; moreover, they present four case studies in which smallholder producers have been successfully targeted by big data applications.

Cyber Risk
Cyberattack falls within the larger category of terrorist attack, which in the majority of cases is a politically motivated attack on vulnerable elements of national infrastructure, such as transport networks; it can, however, also be a weapon in commercial conflict. When critical elements of the energy infrastructure are damaged, there is a high danger of blackouts. Apart from cyberattack, such damage may also result from human error, negligence, or a lack of preparation, training, or motivation, or may occur because the personnel responsible for monitoring systems and warning of imminent overloads and outages have fallen ill or are otherwise physically unable to carry out their work.
Power providers and insurance companies are extremely vulnerable to the damage that can follow outages caused by disruption of the national grid, as demonstrated by a study carried out by Mills and Jones (2016) [154]. The authors gathered data from multiple real-life disruptions to the US national grid to estimate insurance claims and found that, across all cases, between 3% and 64% of total loss costs were insured. Big data can also be used to simulate different causes of grid disruption and, hence, enhance the accuracy of premium pricing. A recent study by Biener et al. (2015) [155] extracted data on 994 incidents of cyber loss from an operational risk database and analyzed their statistical properties to assess empirically the degree to which cyber risk is insurable. Their findings indicate that the development of a sustainable cyber insurance market will be hindered by a lack of data, by gross information asymmetries, and by the fact that losses are, to a significant degree, interrelated.
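As a sketch of the kind of statistical severity analysis such studies rely on, the snippet below fits a heavy-tailed lognormal distribution to synthetic cyber loss amounts and derives pricing-relevant tail quantities; the data and the distributional choice are assumptions for illustration, not the cited authors' model.

```python
# Severity analysis sketch: fit a lognormal to cyber losses and read off
# quantities relevant to premium pricing. Synthetic data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
losses = rng.lognormal(mean=12.0, sigma=2.0, size=994)  # 994 incidents, as in [155]

# Fit a lognormal with location fixed at zero.
shape, loc, scale = stats.lognorm.fit(losses, floc=0)
fitted = stats.lognorm(shape, loc=loc, scale=scale)

print("mean loss:     ", round(fitted.mean()))
print("99.5% quantile:", round(fitted.ppf(0.995)))  # solvency-style tail measure
print("P(loss > 10M): ", round(fitted.sf(1e7), 4))
```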
Eling and Schnell (2016) [156] reached similar conclusions, finding that systematic insurance against cyber risk is hampered by the lack of both data and appropriate modeling methods, as well as by high change risk and even higher accumulation risk. Despite these conclusions, however, other researchers have found that big data has the potential to generate the necessary data. It has been suggested that data can be classified through combined supervised learning techniques to create security-aware information, so that any data which could damage either insurance providers or clients remains confidential. Gai et al. (2016) [157] have applied this approach to enhance the cloud-based cyber risk management of financial big data.
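A minimal sketch of such a combined supervised-learning classifier, flagging records as sensitive before they are shared for analysis, might look as follows; the features, data, and ensemble composition are hypothetical, not Gai et al.'s actual pipeline.

```python
# Combine several supervised learners to flag sensitive records; toy data only.
import numpy as np
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(7)
n = 1000
X = rng.random((n, 4))                            # e.g., field entropy, PII-pattern hits
y = (X[:, 0] + 0.5 * X[:, 1] > 0.9).astype(int)   # 1 = sensitive record (synthetic rule)

ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("rf", RandomForestClassifier(n_estimators=50)),
                ("nb", GaussianNB())],
    voting="soft",  # average predicted probabilities across the three learners
).fit(X, y)

# Records voted sensitive would be masked or withheld before analysis.
print("flagged:", int(ensemble.predict(X).sum()), "of", n)
```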

Summary
Summarizing the findings, we provide an overview of the various machine learning (ML) and deep learning (DL) algorithms and their (potential) fields of application within the insurance sector in Table 1. We assign the corresponding ML and DL algorithms to the following six insurance categories: (1) Automobile, (2) Mortality Modeling, (3) Healthcare, (4) Harvest Risk, (5) Catastrophe Risk (Hurricanes, Tornadoes, Geomagnetic Events, Earthquakes, Floods, and Fires), and (6) Climate Risk. We distinguish between three types of machine learning algorithms: (1) supervised learning algorithms, which are task-based; (2) unsupervised learning algorithms, which are data-driven; and (3) reinforcement learning, which reacts to its environment. Supervised learning algorithms can be further sub-classified into classification, regression, and ensembling; among unsupervised learning algorithms we distinguish between association, clustering, and dimensionality reduction. Table 1 relates the most widely applied algorithms to the insurance categories we investigate and to the technologies and frameworks used to perform them, such as the Internet of Things (IoT), blockchain, virtual reality, quantum computing, cloud computing, and cybersecurity. Moreover, we highlight examples of how they are applied as well as their level of efficiency.
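To make the taxonomy concrete, the short sketch below instantiates one standard algorithm from each supervised and unsupervised sub-category on toy data (reinforcement learning is omitted, as it requires an interactive environment rather than a fixed dataset); the algorithm choices are illustrative only.

```python
# One illustrative algorithm per family from the taxonomy above.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.random((200, 5))
y_class = (X[:, 0] > 0.5).astype(int)   # toy classification target
y_reg = X @ rng.random(5)               # toy regression target

print(LogisticRegression().fit(X, y_class).score(X, y_class))      # supervised: classification
print(LinearRegression().fit(X, y_reg).score(X, y_reg))            # supervised: regression
print(RandomForestClassifier().fit(X, y_class).score(X, y_class))  # supervised: ensembling
print(KMeans(n_clusters=3, n_init=10).fit(X).inertia_)             # unsupervised: clustering
print(PCA(n_components=2).fit(X).explained_variance_ratio_)        # unsupervised: dim. reduction
```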
For a detailed description of the functionality of each algorithm, we refer to Shaw (2019) [158]. Our analysis shows that the level of efficiency does not depend on the type of algorithm or technology used, but rather on the nature of the underlying business and the complexity of the associated system, e.g., the complexity of forecasting the climate system, or the availability, consistency, and robustness of historical data and patterns, as in automotive or mortality modeling.
We further provide an overview and classification of the various machine learning (ML) and deep learning (DL) techniques in Figure 6.

Conclusions
The role of big data in the insurance sector is growing and will continue to grow. The power of insurers to predict the probability of future occurrences, and thus price premiums accurately, is enormously enhanced by analyzing big data related to past contingent claim events. Furthermore, big data-based simulation models enable insurers to predict an ever-larger variety of potential outcomes, which is of significant help for companies operating in the areas of risk management and hedging. The current paper attempts to set out the current uses, advantages, and limitations of big data within the insurance sector, and suggests how, and in which actuarial fields, it may be applied in the future. The fields discussed are automobile insurance, mortality modeling, healthcare, harvest risk, and cyber risk; catastrophe risk such as storms, hurricanes, tornadoes, geomagnetic events, earthquakes, floods, and fires; and climate risk in general. In all these areas, big data can be leveraged to improve the accuracy of predictions and, consequently, of policy premium pricing. We assess the current applications of big data in each field and highlight the importance of data processing capabilities, given the immense volume and speed at which data collection already takes place and is likely to accelerate further.
In the automobile sector, insurance policies use statistical techniques to infer ratings from accident reports. With the emerging field of IoT, insurance policy pricing will become more accurate as more in-motion driver data becomes available. For mortality modeling, machine learning techniques such as neural networks enable insurance companies to incorporate individual features to calculate and predict conditional likelihoods that might be linked to mortality. The healthcare insurance sector benefits enormously from big data analytics, since pre-diagnostic data collection enables more accurate predictions of how likely an individual is to claim a certain healthcare benefit.
Through the utilization of satellite data and predictive analytics, insurers can better assess the probability of droughts, floods, heat waves, and other events which might lead to a partial or total loss of harvest. Data mining strategies that create comprehensive meteorological datasets also allow increasingly accurate predictions of storms, such as hurricanes and tornadoes, and even of geomagnetic events, by processing real-time observations of the sun. With respect to real-time observation, blockchain-based pre-detection mechanisms using IoT are already capable of alerting the population in real time in the event of an emerging earthquake.
One of the fastest-growing challenges insurers face is the assessment of climate risk. Because periodic changes in climate unfold over very long durations, it is impossible to rely on historic data alone to assess the economic damage caused by climate change: in every century, industrial progress and population density have been at different levels and have therefore caused different amounts of economic damage. Nevertheless, with data analytics and the adaptation of insurance instruments, insurance policy pricing is increasingly capable of taking interdependent and causal effects into account.
In general, a high penetration of insurance policy pricing is found in all the fields listed above, except cyber risk. Modeling cyber risk for insurance purposes, and hence developing a sustainable market in cyber insurance, remains extremely difficult due to a lack of data and gross data asymmetries. The potential to solve these problems lies in the use of big data, alongside ML techniques, including deep or supervised learning.