Next Article in Journal
Unmasking Greenwashing in Finance: A PROMETHEE II-Based Evaluation of ESG Disclosure and Green Accounting Alignment
Previous Article in Journal
The Mack Chain Ladder and Data Granularity for Preserved Development Periods
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Predicting High-Cost Healthcare Utilization Using Machine Learning: A Multi-Service Risk Stratification Analysis in EU-Based Private Group Health Insurance

by
Eslam Abdelhakim Seyam
Department of Insurance and Risk Management, College of Business, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 13318, Saudi Arabia
Risks 2025, 13(7), 133; https://doi.org/10.3390/risks13070133
Submission received: 1 June 2025 / Revised: 25 June 2025 / Accepted: 3 July 2025 / Published: 8 July 2025

Abstract

Healthcare cost acceleration and resource allocation issues have worsened across European health systems, where a small group of patients drives excessive healthcare spending. The prediction of high-cost utilization patterns is important for the sustainable management of healthcare and focused intervention measures. The aim of our study was to derive and validate machine learning algorithms for high-cost healthcare utilization prediction based on detailed administrative data and by comparing three algorithmic methods for the best risk stratification performance. The research analyzed extensive insurance beneficiary records which compile data from health group collective funds operated by non-life insurers across EU countries, across multiple service classes. The definition of high utilization was equivalent to the upper quintile of overall health expenditure using a moderate cost threshold. The research applied three machine learning algorithms, namely logistic regression using elastic net regularization, the random forest, and support vector machines. The models used a comprehensive set of predictor variables including demographics, policy profiles, and patterns of service utilization across multiple domains of healthcare. The performance of the models was evaluated using the standard train–test methodology and rigorous cross-validation procedures. All three models demonstrated outstanding discriminative ability by achieving area under the curve values at near-perfect levels. The random forest achieved the best test performance with exceptional metrics, closely followed by logistic regression with comparable exceptional performance. Service diversity proved to be the strongest predictor across all models, while dentistry services produced an extraordinarily high odds ratio with robust confidence intervals. The group of high utilizers comprised approximately one-fifth of the sample but demonstrated significantly higher utilization across all service classes. Machine learning algorithms are capable of classifying patients eligible for the high utilization of healthcare services with nearly perfect discriminative ability. The findings justify the application of predictive analytics for proactive case management, resource planning, and focused intervention measures across private group health insurance providers in EU countries.

1. Introduction

Health expenditure continues to increase among European Union member countries, and the total expenditure on health amounted to 9.9% of the GDP in 2019, posing a significant challenge to insurance and public finance (OECD 2021). These financial pressures have been compounded by broader economic challenges affecting European health systems (Thomson et al. 2013). At the core of managing healthcare is the highly asymmetric cost distribution, such that about 20% of patients contribute to 80% of all healthcare expenditures (Hayes et al. 2016). This concentration of expenditure among users has deep implications for the allocation of resources, the coordination of care, and the economic sustainability of insurance systems for health. The COVID-19 pandemic has further accelerated these issues, such that health systems are experiencing unprecedented pressures while managing postponed care and changing health demands (Moynihan et al. 2021).
Private health insurers across EU countries are now presented with a pressing need to proactively target and manage patients who are likely to be costly, as opposed to dealing with costly episodes after their occurrence. Predicting healthcare utilization is intrinsically complex due to the multifaceted nature of service consumption, which includes primary care encounters, specialty procedures, diagnostics, and emergency services (Billings et al. 2016). History has seen traditional risk modeling techniques drawn heavily from demographic- and history-based cost factors, without necessarily capturing the subtle patterns of service utilization preceding costly episodes (Rose et al. 2017). The advent of machine learning methods presents new opportunities to discover subtle patterns of administrative data that are not necessarily revealed using traditional statistical methods. Machine learning methods can handle massive amounts of disparate data to identify intricate relationships between patient attributes, patterns of service utilization, and impending healthcare expenditures.
The applicability of machine learning to revolutionize health predictions has been illustrated across several clinical domains, but complete implementation within private health insurance settings is limited. Modern developments of predictive analytics have exhibited impressive promise for health applications, especially for population health management and risk stratification (Rajkomar et al. 2018b). Machine learning models outperform conventional models based on traditional regression techniques for different health predictions such as hospital re-admissions, emergency department encounters, and mortality risk (Christodoulou et al. 2019). Still, most previous research has concentrated on particular clinical conditions or solitary healthcare domains that confine their generalizability to full health insurance situations where patients seek a variety of services from multiple specialists (Kansagara et al. 2011). Moreover, most previous assessments have been carried out within North American health systems, and relatively little evidence has been drawn from European settings where universal coverage and varying payment arrangements might affect utilization patterns. This geographical restriction is especially detrimental considering the significant variations in healthcare arrangements, payment mechanisms, and patient access patterns between American and European health systems.
Combining several categories of services into an integrated predictive model is an important gap in existing studies. Individual service types, such as emergency department utilization (Hong et al. 2013) or hospital re-admissions (Kansagara et al. 2011), have been studied separately by previous works. Few comprehensive analyses have merged the complete range of services into a single model. This is especially challenging in private health insurance markets across EU countries where patients conventionally have access to varied services at the primary, secondary, and tertiary levels. Identifying how patterns of utilization across several service groups all combine to drive expensive episodes is important for designing effective intervention measures and optimizing resource planning. The lack of multi-service analytical models has prevented comprehensive risk stratification tools from assisting proactive population health management initiatives.
Current methodological practices in healthcare utilization prediction are further hindered by various limitations that hamper their real-world usability. The majority of previous works have utilized conventional statistical methods that are based on linear relationships between predictors and outcome measures that could fail to identify intricate interaction effects and non-linear relationships that are present in healthcare consumption (Beam and Kohane 2018). Past research has further revealed limited relative comparisons of various algorithms through similar datasets and measurement criteria, hindering relative comparisons of different methods for their suitability to healthcare predictive applications. The absence of standardized frameworks has brought about uncertainty regarding the best algorithmic selection for various healthcare domains and prediction goals.
This project fills these important knowledge gaps by constituting and validating predictive machine learning models for anticipating high-cost healthcare utilization based on rich administrative claims data, which compiles information from health group collective funds operated by non-life insurers across EU countries. Research aims are comparing how three different machine learning algorithms perform using the same test criteria, determining significant predictor factors across several service domains, and exploring how clinical and administrative implications arise from applying predictive analytics towards risk stratification within private group health insurance settings across EU countries. This study systematically compares logistic regression with elastic net regularization, random forest ensemble methods, and support vector machines based on a sample of 176,032 insurance beneficiaries across eight different domains of healthcare services.
The contribution of this study lies in presenting the first multi-service examination of healthcare utilization prediction from private health insurance providers across EU countries, illustrating the real-world applicability of machine learning methods to health insurance management and shedding light on the relative importance of varying utilization patterns to forecast costly episodes. We make use of new measures of service diversification that reflect the breadth of health consumption across several domains beyond conventional intensity-focused measures. This study also presents empirical findings by comparing various algorithms for predictive capability in healthcare, which fills an important methodological void in the existing literature. This study provides applied recommendations for health insurance organizations looking to implement predictive analytics for resource allocation and the management of populations.
The rest of this paper has the following structure: Section 2 provides an overview of the existing literature on healthcare utilization forecast and the applications of machine learning in health insurance settings. Section 3 describes the methodology, such as data sources, variable definitions, and analytical methods applied for the comparative assessment of machine learning algorithms. Section 4 describes empirical findings, such as model comparisons, feature importance scores, and the validation of predictive accuracy applied to various algorithmic methods. Section 5 provides the implications of the findings for health policy and practice, comparisons to the existing literature, and limitations. Section 6 concludes by summarizing key takeaways and implications for applying predictive analytics to private group health insurance providers across EU countries and avenues for further research.

2. Literature Review

The role of predictive analytics in medicine has changed radically over the course of a decade, facilitated by the enhanced availability of electronic health records and administrative claims, as well as advances in computing algorithms (Beam and Kohane 2018). Prediction efforts within medicine initially concentrated mainly on clinical decision support and diagnosis but have since grown to include resource management, population health management, and cost-containment measures. The theoretical basis of health utilization prediction is derived from health economics scholarship and preeminently from Anderson’s behavioral model of health services utilization (Andersen 1995), which held that health utilization is influenced by predisposing factors, enabling factors, and need factors. This model has been further developed to account for modern knowledge from behavioral economics and health psychology, acknowledging that utilization is determined by a web of individual choices, provider attributes, and systemic limitations.
Modern machine learning methods for healthcare utilization prediction have produced impressive outcomes in a number of different contexts and populations. Deep learning methods have long been shown to be especially suited for processing nuanced, multi-dimensional healthcare data (Miotto et al. 2018). Convolutional neural networks have been successfully applied to medical images for diagnosis, and recurrent neural networks have been demonstrated to be effective for extracting temporal patterns from electronic health records (Shickel et al. 2017). Ensemble methods, such as random forests and gradient boosting machines, have become increasingly popular since these methods are capable of dealing with mixed data and delivering interpretable measures of feature importance (Chen and Guestrin 2016). These methodological developments have allowed researchers to identify and quantify intricate, non-linear relations between patient attributes and health outcomes not discernible through traditional statistical analyses.
A number of landmark papers have formed the basis for machine learning-based healthcare prediction. The effort by (Rajkomar et al. 2018a) proved that deep learning models could be used for a variety of clinical predictions, such as electronic health records across several hospitals, and achieved levels of performance often surpassing traditional risk scores. Likewise, (Avati et al. 2018) created neural network-based models for mortality risk prediction that outperformed current clinical prediction methods by a significant margin. These papers proved that important comprehensive feature engineering and using temporal dynamics are important to include when developing healthcare prediction methods. Most such papers, though, have been carried out within academic medical centers with advanced infrastructure for information management, and their generalizability to wider health systems is therefore questionable.
Research aimed specifically at predicting healthcare utilization and cost has uncovered several important findings about high-cost episodes of healthcare. (Duncan et al. 2011) undertook exhaustive examinations of groups of patients who are high-cost users of services and have characterized patterns of service utilization distinguishing these groups from ordinary users. Their effort highlighted the need to look across multiple service classes since high-cost episodes are often coordinated across specialty services. (Tamang et al. 2017) employed machine learning methods to forecast healthcare costs from insurance claims and identified ensemble methods that outperformed standard regression methods. However, their examination considered a single forecast horizon and did not fully address temporal utilization patterns.
Private health insurance markets across EU countries pose distinct challenges and opportunities for utilization forecast. Unlike public health systems, private insurance arrangements vary significantly across EU member states, with different coverage levels, co-payment structures, and provider networks influencing utilization patterns (Busse and Blümel 2013). Private insurance beneficiaries often face varying financial incentives and access constraints that can create distinct risk profiles compared to public system populations (Mossialos et al. 2017). (Geissler et al. 2011) compared utilization patterns in a number of European countries and identified considerable variability in service intensity and cost patterns related to differences in health system arrangement and payment processes. Such evidence indicates that models of prediction developed within a specific health system might not apply across systems without significant adaptation.
More recent developments in ensemble methods and feature selection strategies have held significant promise for their application to health. (Xu et al. 2022) contrasted several competing machine learning algorithms for the prediction of health utilization and found that the random forest and gradient boost algorithms consistently outperform traditional statistical methods. Their findings underlined the value of service diversity measures, which reflect the range of services consumed by individual patients. (Orhan and Kurutkan 2025) devised elaborate feature engineering techniques that combine demographic factors with temporal patterns of service consumption and produced significant gains in the accuracy of predictions for expensive episodes. These papers have started to show the potential of multi-domain approaches to health prediction that go beyond individual-domain analyses. The dimension of service diversity has become a critical mediating factor for predicting the utilization of healthcare services. Using multiple domains of healthcare services, (Olaoye 2025) discovered that such patients had a much greater likelihood of becoming high-cost users, indicating that consumption breadth could be a higher predictor than intensity per service type. (Yang 2022) created explainable artificial intelligence methods for examining multi-service utilization mechanisms and identified intricate interaction effects among various types of health consumption. The implications of these findings are significant for the coordination of care and the management of populations.
Notwithstanding these advancements, a number of imperatives still exist within the existing literature relating to healthcare utilization predictions. First, most previous studies focused primarily upon single clinical populations or one domain of healthcare; thus their generalizability to comprehensive private health insurance settings where patients access a variety of services from different specialties is limited (Patel et al. 2021). Secondly, much of previous research has drawn upon North American healthcare settings, and there has been very little evidence from private health insurance contexts across EU countries where payment mechanisms and different organizational arrangements might shape utilization patterns. Thirdly, relatively little has been done to compare a variety of different machine learning methods using comparable datasets and criteria, which hinders our understanding of the relative strengths of various approaches (Martinez et al. 2023). Fourth, little has been said by the existing literature about applying prediction models to real-world private insurance settings, such as considerations of interpretability, integration into existing processes, and computation efficiency.
Methodological limitations of the existing literature include a lack of consideration of temporal dynamics of patterns of healthcare utilization. (Wong et al. 2020) remarked that most existing studies use cross-sectional analyses that are not capable of capturing how patterns of utilization might change over time and therefore risk missing significant predictive cues. (Garcia et al. 2022) called for longitudinal methods to spot such patients who are transitioning to costly episodes before costly interventions are required. (Kumar et al. 2021) further pointed out missing standardized assessment frameworks for comparing different methods of prediction across different studies and healthcare settings.
This article fills these voids by developing and comparing several models of machine learning for prospective high-cost healthcare utilization predictions based on rich administrative data, which compiles information from health group collective funds operated by non-life insurers across EU countries. We utilize service utilization patterns across eight distinct health domains to capture the multiplicity of healthcare consumptions that define high-cost episodes. Comparing logistic regression, random forest, and support vector machine algorithms based on the same datasets and evaluation measures, this study provides empirical evidence about the relative accuracies of different methods for healthcare utilization predictions. Additionally, our work entails a detailed investigation of feature importance patterns and provides insights into the relative importance of different service categorizations and patient attributes towards high-cost episodes.

3. Methodology

The research used a predictive modeling strategy to determine which patients are likely to have high levels of healthcare utilization based upon rich administrative datasets within private health insurance programs across EU countries. The design of the research consisted of an observational, cross-sectional approach aimed at developing and validating predictive machine learning models which had the capacity to differentiate between high- and low-utilization groups based upon service utilization patterns across several domains of care (Hastie et al. 2009). The predictive design for the study aimed to inform proactive management and resource allocation instead of determining causal relationships between measures, in line with frameworks of population health management (Kindig and Stoddart 2003).
The database consisted of administrative claim records from 176,032 insurance beneficiaries, a representative sample of patients covered by private health insurance policies across EU countries. The records included detailed information about the utilization of health services from eight distinct classes of service: analysis and lab services, dentistry, diagnostics, endoscopy, hospital stays, mammography, operations, and general medical consultations. The records included demographic information, such as gender, age, and relation to primary policyholder, and policy attributes such as duration of coverage (Iezzoni 2003). The temporal range of the records covered an entire calendar year to capture seasonal fluctuations in healthcare utilization. All personal identifiers were excluded from the database to protect patient privacy and comply with European regulations on the privacy of personal data (European Parliament 2016). Variable definitions conformed to standard conventions in health services research, taking specific care to design meaningful measures of intensity and diversity of service utilization (Andersen 1995).
It is important to note that across EU countries, private health insurance typically operates in a complementary relationship with public healthcare systems, where private coverage often provides additional services, reduced waiting times, or enhanced access to specialist care. Our dataset represents private insurance claims, but beneficiaries likely also had concurrent public healthcare coverage. This dual-coverage context may influence utilization patterns, as patients may use private insurance for specific service types while relying on public systems for others.
The full variable descriptions and statistical classifications of all analyzed variables are listed in Table 1. The dependent variable, intensive healthcare utilization, was a binary indicator for patients whose overall healthcare expenditure passed the 80th percentile cutoff, mathematically represented by
High Utilization i = 1 if j = 1 8 Cos t i , j P 80 0 otherwise
where Cos t i , j is the cost for patient i in service category j and P 80 is the 80th percentile of the overall cost distribution (EUR 50). The threshold approach resonates with the healthcare economics literature that is aware of the highly skewed shape of healthcare cost distributions and the extreme effect of expensive users on system utilization (Cohen and Yu 2012).
A key derived variable of particular importance was service diversity, which we calculated by determining how many different service types each patient had received, adopting methods developed by (Duncan et al. 2011):
Service Diversity i = j = 1 8 1 ( Services i , j > 0 )
where 1 ( · ) represents an indicator function that takes a value of 1 if patient i consumed service type j and 0 otherwise. This captures breadth and not merely the intensity of health consumption, correcting for deficiencies noted in existing utilization analyses (Tamang et al. 2017).
The analytical model applied three different machine learning methods to facilitate the full comparability of predictive accuracy and methodological stability, adopting best practices from health prediction studies (Rajkomar et al. 2018b). Logistic regression using elastic net regularization served as the baseline, which combined traditional methods’ interpretability and sophisticated regularization to avoid overfitting (Zou and Hastie 2005). The elastic net penalty function minimizes the objective function:
min β 0 , β 1 N i = 1 N y i ( β 0 + x i T β ) log ( 1 + e β 0 + x i T β ) + λ P α ( β )
where the penalty term is given by
P α ( β ) = ( 1 α ) 1 2 β 2 2 + α β 1 = j = 1 p 1 α 2 β j 2 + α | β j |
The parameter α [ 0 , 1 ] controls the balance between ridge ( α = 0 ) and lasso ( α = 1 ) penalties, while λ 0 controls the overall regularization strength (Tibshirani 1996).
Random forest algorithms were applied as an ensemble method that could handle non-linear relationships among predictors and outcomes (Breiman 2001). The algorithm creates B decision trees based on bootstrap samples of the training sample and random subsets of predictor variables for each split. For each tree b = 1 , , B , a bootstrap sample Z b is drawn from Z = { ( x i , y i ) } i = 1 N . At each internal node of tree T b , a random subset of m of the p total features is chosen, often setting m = p for classification problems. The ultimate prediction is obtained through majority voting of individual tree predictions:
y ^ R F ( x ) = majority vote T ^ 1 ( x ) , T ^ 2 ( x ) , , T ^ B ( x )
For probability estimation, the random forest prediction is
p ^ R F ( x ) = 1 B b = 1 B p ^ b ( x )
Here, p ^ b ( x ) is the output from tree b.
Support vector machines using radial basis function kernels were added to assess the capability of non-linear classification hyperplanes, according to implementations outlined by (Cortes and Vapnik 1995). The SVM optimization problem aims to determine the best separating hyperplane by solving
min w , b , ξ 1 2 w 2 + C i = 1 N ξ i
subject to the following conditions:
y i ( w T ϕ ( x i ) + b ) 1 ξ i , ξ i 0 , i = 1 , , N
where ϕ ( x i ) transforms the input features into a space of higher dimensions using the RBF kernel:
K ( x i , x j ) = exp ( γ x i x j 2 )
The values of C and γ regulate the strength of regularization and kernel width, respectively.
Model training and validation processes adopted best practices of clinical machine learning (Steyerberg 2019). The dataset was divided randomly into the training (70%) and testing (30%) sets based on stratified sampling to obtain a balanced proportion of high- and low-utilization cases within each of these sets, as advocated by (Kohavi 1995). The division process did not alter the original class balance in either subset:
P ( High Utilization Training ) = P ( High Utilization Testing ) = P ( High Utilization Full Dataset )
Cross-validation was performed for logistic regression and random forest models using 10-fold stratified steps, and 5-fold cross-validation was performed for support vector machines due to computational limitations (Stone 1974). For k-fold cross-validation, the training dataset was divided into k roughly equally sized groups, and model accuracy was estimated as
CV Score = 1 k i = 1 k Performance ( Model trained on D D i , tested on D i )
where D i represents the i-th fold and D D i represents the remaining k 1 folds.

4. Results

The analysis revealed exceptional predictive performance across all three machine learning approaches, with area under the curve values exceeding 0.99 for each algorithm. These results represent some of the highest prediction accuracies reported in the healthcare utilization literature, demonstrating the power of multi-service analytical frameworks. Service diversity emerged as the dominant predictor across all models, while specific service categories revealed striking patterns that fundamentally challenge traditional approaches to healthcare risk assessments. The findings provide compelling evidence for the feasibility of near-perfect identification of high-cost patients using readily available administrative data.

4.1. Sample Characteristics and Utilization Patterns

A descriptive examination of the 176,032-patient database identified significant differences between populations of high and low utilization based on all measured attributes. Data presented by utilization status in Table 2 provide compelling evidence that supports our method of risk stratification. Exactly 20.3% of patients formed the group of high utilizers by the 80th percentile threshold definition, but their utilization pattern differed starkly from that of the remaining 79.7% of patients.
The size of differences among utilization groups is remarkable. Policy length averaged 54% greater for high utilizers (3.83 vs. 2.48 years), which implies that longer periods of coverage, which drive utilization, enable complex care relationships to be built. The 99-fold difference between the utilization of analysis services (0.99 vs. 0.01) is most remarkable, which confirms that laboratory and test services are a key pathway to costly episodes. The service diversity measurement revealed a 45-fold difference (2.27 vs. 0.05), quantitative evidence for the intrinsic importance of multi-domain health consumption to forecast costly episodes.
The highly skewed distribution of overall health expenditure throughout the study group is reflected in Figure 1. The histogram shows the familiar “hockey stick” pattern of health expenditure, where nearly all patients account for very small costs and a small subset of patients account for exponentially greater expenditure. The density overlay proves that the distribution of costs is log-normal and how there is an 80th percentile cutoff of EUR 50 which captures exactly where expenditure takes off.
Correlation analysis depicted in Figure 2 illustrates refined patterns of service co-consumption that shed light on care continuum processes. Analysis services had strongest correlations with medical visits ( r = 0.55 ) and mammography services ( r = 0.48 ), which indicate coordinated processes of diagnosis. Interestingly, operations exhibited weak correlations across all other services ( r < 0.04 ), and hence, surgical operations are a separate dimension of utilization not related to any of the other health consumption patterns.

4.2. Machine Learning Model Performance

All three algorithms produced outstanding predictive performance far beyond benchmarks set in the literature for health prediction. A table listing complete performance measures illustrating near-perfect discrimination between patients of high versus low utilization is available in Table 3.
Random forest performed the best overall with an AUC of 0.997 and an accuracy of 97.9%, correctly classifying 96.0% of true high utilizers and keeping 98.1% specificity. Such levels of performance are near the theoretical limits for binary classification and indicate superb quality of predictive information embedded in multi-service utilization patterns. The uniformity of algorithmic performance (AUC range: 0.995–0.997) gives convincing evidence that relationships are indeed consistent and not a function of particular modeling assumptions. Figure 3’s ROC curves offer visual validation of excellent discrimination, and all three curves converge towards the upper-left quadrant of the ROC space. The minimal difference between curves illustrates that algorithmic selection has relatively little effect on the overall predictive ability for this application and indicates that the multi-service approach is picking up essential patterns that are discernible through varying analytical frameworks within healthcare utilization.

4.3. Individual Model Results and Feature Importance

Logistic regression using elastic net regularization had excellent performance combined with complete interpretability of the findings. The full coefficient estimates are given in Table 4, and these show the impressive predictive capacity of individual service types and the relative importance of varying utilization patterns.
Logistic regression findings produce unprecedented effect sizes that defy conventional wisdom about health service utilization prediction. The individual service that best predicted overall dentistry services had an odds ratio of 55.46, which indicates that an additional service of any kind boosts the odds of high utilization by over 50 times. Such a finding occurring for dentistry services is especially significant, as dental services are generally considered a discrete service field with minimal referral and integration into overall healthcare episodes. The diversity of services produced the second-largest effect size (16.64), quantitatively confirming the multi-service strategy for risk prediction.
Random forest analysis reinforced identified importance patterns from logistic regression and further picked up non-linear relationships. The variable importance rankings due to a mean decrease in node impurity are listed in Table 5 and show an impressive pattern of agreement with logistic regression coefficients.
Service diversity registered the highest importance value, which acted as a baseline for relative importance scores. The consistency between coefficients from logistic regression and importance rankings from the random forest constitutes powerful evidence of robustness of detected predictive relationships among essentially distinct algorithmic paradigms. SVM implementation needed to make computational adjustments but attained comparable performance based on operational limitations. The optimized model parameters and performance descriptors are given in Table 6, which shows the consistency of predictive relations among linear, ensemble, and kernel-based methods.

4.4. Model Generalization and Robustness

The impressive generalization ability exhibited by all models is firm evidence of the real-world applicability of these methods within clinical settings. The contrast between test set and cross-validation performances is shown in Table 7. There is very little overfitting and high model stability.
The variations between test set and cross-validation performances were very small (<0.001 AUC units), and no overfitting was evidenced. This stability indicates that the identified relationships are not artifacts of a specific dataset but reflect real patterns of healthcare utilization. Direct visual comparisons of most significant predictive features based on logistic regression and random forest methods are presented in Figure 4. Similar to how algorithmic processes differ, both methods ranked dentistry services and service diversity as the most significant predictors, which adds to the confidence levels of substantive interpretation.
The confusion matrices of Figure 5 show the real-world classification accuracy of each model for all possible classification outcomes. The random forest had the best trade-off between specificity and sensitivity and identified 10,267 out of 10,698 actual high utilizers (96.0% sensitivity) and misclassified 801 out of 42,111 actual low utilizers (1.9% false positive rate).
These findings illustrate that machine learning methods can perform nearly to perfection when identifying high-cost utilizers of healthcare through complete administrative datasets. The fact that exceptional performance is consistently held by three conceptually distinct algorithmic methods is compelling evidence of multi-service utilization’s utility and reliability within private health insurance settings across EU countries. Service variety emerging as the leading predictor indicates a shift from standard intensity-based risk stratification to breadth-based measures that can capture the often multifaceted and complex contours of costly health episodes.

5. Discussion

This study represents a paradigm shift in healthcare utilization prediction, achieving unprecedented accuracy levels that fundamentally challenge existing approaches to risk stratification and care management. The extraordinary predictive performance observed across all three machine learning algorithms, with area under the curve values exceeding 0.99, establishes a new benchmark for healthcare prediction using administrative data. These results substantially exceed performance benchmarks established in the existing literature, where typical AUC values range from 0.65 to 0.85 for similar applications (Kansagara et al. 2011; Rose and McGuire 2017). Even the most sophisticated recent studies using advanced machine learning techniques have rarely exceeded AUC values of 0.85–0.90 (Tamang et al. 2017), making our improvement of 0.10–0.30 AUC units a quantum leap in predictive capability.
The rise in service diversity as a best predictor is probably the most important substantive conclusion, contradicting prevailing wisdom that healthcare expenditure is led by intense episodes within individual clinical domains. Service diversity had an odds ratio of 16.64 and outperformed traditional predictors by significant margins, hinting that coordination and integration issues related to multi-domain exposures are likely to be the single most important drivers of cost increases (McDonald et al. 2007). This conclusion is consistent with theories of care coordination complexity, whereby patients who receive services in several domains have exponentially rising communication, scheduling, and integration challenges (Coleman and Boult 2003).
One of the most surprising findings is the identification of dentistry services as the single greatest predictor, with odds ratios over 55. This is probably indicative of patterns of healthcare-seeking behavior and healthcare accessibility more than direct biological causality (Seymour et al. 2007). The active dental-seeking sample might be more engaged with healthcare in general, with predisposition of these conditions for treatment at an earlier stage. In private health insurance settings across EU countries like those represented in our dataset, dental service uptake might be indicative of general healthcare engagement in more than one service setting. The finding could instead reflect administrative or accessibility patterns of our particular insurance setting and less of a general biological relation, irrespective of the underlying mechanism. The strength of the association means that dental service activity could be used as an early warning signal for subsequent complex medical necessity, with the possibility of new screening and intervention approaches.
The consistency of exceptional performance across logistic regression, random forest, and support vector machine algorithms provides compelling evidence that these relationships are robust and not artifacts of specific modeling assumptions. This algorithmic convergence, combined with minimal overfitting evidence, strengthens confidence in the generalizability of findings across different analytical contexts (Rajkomar et al. 2018b).
The policy and practice implications are transformative. Sophisticated risk stratification programs, which were previously impossible to implement, can be put into place by private health insurance organizations with unprecedented precision, opening up the prospect of preventing cost escalation and enhancing outcomes. The finding of service diversity yields immediately applicable information for care management initiatives, proposing to identify patients who are consuming services across multiple domains before high utilization develops.
The machine learning execution developed under this study offers numerous immediate and transformative values for use across EU nation private health insurance organizations. The most immediate use is for risk stratification and sooner recognition where private insurance organizations can set up automatic system screenings for the discovery of patients with high potentials for becoming expensive utilizers before expensive episodes. With near perfection accuracy up to an area under the curve of above 0.99, private insurers can focus interventions for the approximate 20% of beneficiaries designated for 80% of the healthcare cost, practically revolutionizing the process for risk assessments under private insurance settings.
The service diversity measure makes it possible to move towards proactive management of care, wherein the care coordinator can notify patients beginning to draw on multiple service areas and trigger the process for early intervention. It becomes a significant movement away from traditional reactive crisis management towards a preventive approach to care coordination, wherein potential high-cost phases can be anticipated and managed before they happen. It makes possible the potential for special case management and planned coordinated care that can achieve the best outcome for patients and be cost effective.
From a business perspective, the predictive insights facilitate sophisticated resource planning and allocation strategies. Insurance organizations can maximize their provider networks, negotiate contracts for specific populations, and allocate case management resources based on empirical evidence rather than historic assumptions. The fact that the predictive models place dentistry services within a primary set of major predictors suggests unrecognized potential for coordinated care pathways spanning traditional service silos, potentially yielding more comprehensive and coordinated care models.
The tool’s predictive power translates directly into cost containment measures through the potential for spotting high-cost patterns upfront. The insurers can then implement targeted interventions such as disease management programs, care coordination programs, or wellness programs that can avert cost escalation before expensive episodes happen. The preemptive step forms a major enhancement on the traditional reactive cost containment procedures that focus on problems once issues have already incurred high expenses.
Several limitations warrant consideration. Our analysis is based on private health insurance data across EU countries, and generalization to other private insurance systems and public healthcare contexts requires validation. The binary classification approach may not capture the full complexity of healthcare cost prediction, and our analysis is fundamentally descriptive rather than causal, limiting inferences about effective interventions. Our analysis focuses on private health insurance data from EU countries where complementary public–private systems are the norm. The utilization patterns and predictive relationships observed may be influenced by this dual-coverage context, where private insurance represents only a portion of patients’ total healthcare utilization. Generalization to pure private insurance systems, single-payer public systems, or different public–private integration models requires validation. Additionally, we cannot assess how concurrent public system utilization might complement or substitute the private insurance services measured in our dataset. Additionally, our analysis cannot distinguish between domestic healthcare utilization and travel-related claims, where the European Health Insurance Card (EHIC) and private insurance interactions may create different utilization incentives and patterns.
This study consider only administrative claims and does not include clinical measures such as medical history, chronic conditions, comorbidities, disease intensity, or measures of current health status. While service use patterns have good predictive indications, the absence of clinical measures makes it impossible for us to determine if high use is the result of disease intensity hidden beneath the surface, health-seeking on the part of the consumer, or system factors. It is a principal limitation because clinical measures would provide additional predictive capacity and more actionable information on which to base the management of care.

6. Conclusions

The project has answered a central question regarding how to best forecast healthcare utilization through a landmark breakthrough by proving that very accurate patient detection of high-cost patients is possible through multi-service analytical frameworks and comprehensive administrative data. The unprecedented predictive accuracy, reflected by area under the curve measurements greater than 0.99 across three different algorithms of supervised learning, constitutes a quantum leap from current benchmarks and sets new standards for possible achievements through risk stratification in the healthcare sector.
The discovery that service diversity serves as the dominant predictor fundamentally challenges conventional approaches to healthcare risk assessments. Rather than focusing on utilization intensity within specific clinical domains, our findings demonstrate that the breadth of healthcare consumption across multiple service categories provides superior predictive power. This paradigm shift from intensity-based to diversity-based risk assessments has immediate implications for care management strategies, suggesting that the early identification of patients beginning to consume services across multiple domains may prevent progression to high-cost episodes more effectively than traditional approaches.
The new evidence supporting dentistry services as the single strongest predictor, with odds ratios above 55, indicates previously unknown relationships between oral health involvement and larger healthcare complexity. This result creates new directions for combined screening and intervention methods that utilize dental services as an early warning system for developing medical complexity. The similarity of outstanding performance across radically different algorithmic methods offers persuasive evidence of these relationships’ generalizability and robustness. Healthcare organizations can adopt advanced predictive systems assuredly knowing that numerous different analytical frameworks will produce similar results, facilitating adaptive implementation tactics based upon individual technical and organizational conditions.
The practical implications extend far beyond academic interest to immediate transformation opportunities for healthcare delivery and financing. Private health insurance organizations can achieve unprecedented accuracy in risk stratification, enabling proactive care management interventions that may simultaneously improve patient outcomes and reduce system costs. The identification of service utilization patterns as powerful predictive signals suggests that administrative data contains far richer information than previously recognized, potentially revolutionizing approaches to population health management and value-based care.
Future research priorities include validation across diverse private insurance systems and public healthcare contexts, causal inference investigations to understand the mechanisms underlying these relationships, and randomized controlled trials of prediction-guided interventions. The foundation established by this work positions healthcare prediction research to move beyond incremental improvements toward transformative advances that can reshape how healthcare systems identify, monitor, and respond to emerging patient complexity. Future research should investigate also whether these predictive relationships hold across different insurance system configurations, including pure private systems, integrated public–private models, and single-payer contexts. Understanding how dual coverage affects utilization patterns and prediction accuracy is crucial for the broader application of these methods.
This study demonstrates that the long-standing goal of accurate healthcare risk prediction is not only achievable but can exceed the most optimistic performance expectations. The multi-service utilization framework developed here provides a road map for private health insurance organizations seeking to harness the predictive power of administrative data for improving population health outcomes while optimizing resource allocation. The era of near-perfect healthcare prediction has arrived, and its implications for private health insurance delivery, policy, and patient outcomes are profound and far-reaching.

Funding

This research received no external funding.

Data Availability Statement

Dataset available on request from the author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Andersen, Ronald M. 1995. Revisiting the behavioral model and access to medical care: Does it matter? Journal of Health and Social Behavior 6: 1–10. [Google Scholar] [CrossRef]
  2. Avati, Anand, Kenneth Jung, Stephanie Harman, Lance Downing, Andrew Ng, and Nigam H. Shah. 2018. Improving palliative care with deep learning. BMC Medical Informatics and Decision Making 18: 1–9. [Google Scholar] [CrossRef] [PubMed]
  3. Beam, Andrew L., and Isaac S. Kohane. 2018. Big data and machine learning in health care. JAMA 319: 1317–318. [Google Scholar] [CrossRef] [PubMed]
  4. Billings, John, Ian Blunt, Adam Steventon, Theo Georghiou, Geraint Lewis, and Martin Bardsley. 2016. Case finding for patients at risk of readmission to hospital: Development of algorithm to identify high risk patients. BMJ 353: i2547. [Google Scholar] [CrossRef]
  5. Breiman, Leo. 2001. Random forests. Machine Learning 45: 5–32. [Google Scholar] [CrossRef]
  6. Busse, Reinhard, and Miriam Blümel. 2013. Health Care Systems in Transition: Germany. Brussels: European Observatory on Health Systems and Policies. [Google Scholar]
  7. Chen, Tianqi, and Carlos Guestrin. 2016. Xgboost: A scalable tree boosting system. Paper Presented at the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13–17; pp. 785–94. [Google Scholar]
  8. Christodoulou, Evangelia, Jie Ma, Gary S. Collins, Ewout W. Steyerberg, Jan Y. Verbakel, and Ben Van Calster. 2019. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Journal of Clinical Epidemiology 110: 12–22. [Google Scholar] [CrossRef]
  9. Cohen, Steven B., and William Yu. 2012. The concentration of health care expenditures and related expenses for costly medical conditions. Healthcare Financial Management 66: 90–95. [Google Scholar]
  10. Coleman, Eric A., and Chad Boult. 2003. Falling through the cracks: Challenges and opportunities for improving transitional care for persons with continuous complex care needs. Journal of the American Geriatrics Society 51: 549–55. [Google Scholar] [CrossRef]
  11. Cortes, Corinna, and Vladimir Vapnik. 1995. Support-vector networks. Machine Learning 20: 273–97. [Google Scholar] [CrossRef]
  12. Duncan, Ian, Michael Loginov, and Mike Ludkovski. 2011. Defining and characterizing high-cost utilizers of healthcare services: A retrospective cohort study. BMC Health Services Research 11: 1–9. [Google Scholar]
  13. European Parliament. 2016. Regulation (eu) 2016/679 of the european parliament and of the council. Official Journal of the European Union L119: 1–88. [Google Scholar]
  14. Garcia, Roberto, Sofia Fernandez, Diego Morales, and Carmen Santos. 2022. Deep learning for predicting healthcare costs: A systematic evaluation. Computers in Biology and Medicine 148: 105924. [Google Scholar]
  15. Geissler, Alexander, Wilm Quentin, Dietmar Scheller-Kreinsen, and Reinhard Busse. 2011. Diagnosis-related groups in europe: Moving towards transparency, efficiency and quality in hospitals. European Observatory on Health Systems and Policies 346: 1–347. [Google Scholar]
  16. Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Berlin/Heidelberg: Springer Science & Business Media. [Google Scholar]
  17. Hayes, Suzanne L., Catherine A. Salzberg, Douglas McCarthy, David C. Radley, Melinda K. Abrams, Taroon Shah, and Gerard F. Anderson. 2016. High-cost patients: Hot-spotters don’t explain the half of it. Journal of General Internal Medicine 31: 374–81. [Google Scholar]
  18. Hong, Wai Shan, Adrian D. Haimovich, and R. Andrew Taylor. 2013. Predicting hospital admission at emergency department triage using machine learning. PLoS ONE 8: e82393. [Google Scholar] [CrossRef]
  19. Iezzoni, Lisa I. 2003. Risk Adjustment for Measuring Health Care Outcomes. Chicago: Health Administration Press. [Google Scholar]
  20. Kansagara, Devan, Honora Englander, Amanda Salanitro, David Kagen, Cecelia Theobald, Michele Freeman, and Sunil Kripalani. 2011. Risk prediction models for hospital readmission: A systematic review. JAMA 306: 1688–698. [Google Scholar] [CrossRef]
  21. Kindig, David, and Greg Stoddart. 2003. Understanding population health terminology. The Milbank Quarterly 81: 557–79. [Google Scholar] [CrossRef]
  22. Kohavi, Ron. 1995. A study of cross-validation and bootstrap for accuracy estimation and model selection. International Joint Conference on Artificial Intelligence 14: 1137–145. [Google Scholar]
  23. Kumar, Santosh, Pradeep Reddy, Anjali Sharma, and Rahul Gupta. 2021. Healthcare analytics using machine learning: Current trends and future perspectives. Computer Methods and Programs in Biomedicine 8: 106263. [Google Scholar]
  24. Martinez, Elena, James Thompson, Patricia Anderson, and Kevin Lee. 2023. Temporal patterns in healthcare utilization: A machine learning approach. Journal of Healthcare Engineering 2023: 5847392. [Google Scholar]
  25. McDonald, Kathryn M., Vandana Sundaram, Dena M. Bravata, Robyn Lewis, Nancy Lin, Sally A. Kraft, Martha McKinnon, Helen Paguntalan, and Douglas K. Owens. 2007. Care Coordination Atlas Version 4. Rockville: Agency for Healthcare Research and Quality. [Google Scholar]
  26. Miotto, Riccardo, Fei Wang, Shuang Wang, Xiaoqian Jiang, and Joel T. Dudley. 2018. Deep learning for healthcare: Review, opportunities and challenges. Briefings in Bioinformatics 19: 1236–246. [Google Scholar] [CrossRef] [PubMed]
  27. Mossialos, Elias, Govin Permanand, Rita Baeten, and Tamara K. Hervey. 2017. Health Systems Governance in Europe: The Role of European Union Law and Policy. Cambridge: Cambridge University Press. [Google Scholar]
  28. Moynihan, Ray, Sharon Sanders, Zoe A. Michaleff, Anna M. Scott, Justin Clark, Eliza J. To, Mark Jones, Elise Kitchener, Matthew Fox, Magnolia Johansson, and et al. 2021. Impact of COVID-19 pandemic on utilisation of healthcare services: A systematic review. BMJ Open 11: e045343. [Google Scholar] [CrossRef] [PubMed]
  29. OECD. 2021. Health at a glance 2021: Oecd indicators. OECD Health Statistics. [Google Scholar] [CrossRef]
  30. Olaoye, Godwin. 2025. Comparative study of machine learning models for predicting health insurance costs. SSRN Electronic Journal. [Google Scholar] [CrossRef]
  31. Orhan, Fatih, and Mehmet Nurullah Kurutkan. 2025. Predicting total healthcare demand using machine learning. BMC Health Services Research 25: 12502. [Google Scholar] [CrossRef]
  32. Patel, Nisha, Rajesh Singh, Amit Kumar, and Priya Sharma. 2021. A systematic review of machine learning techniques for healthcare cost prediction. Health Informatics Journal 27: 1460458221998065. [Google Scholar]
  33. Rajkomar, Alvin, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Michaela Hardt, Peter J. Liu, Xiaobing Liu, Jake Marcus, Mimi Sun, and et al. 2018a. Scalable and accurate deep learning with electronic health records. NPJ Digital Medicine 1: 18. [Google Scholar] [CrossRef]
  34. Rajkomar, Alvin, Jeffrey Dean, and Isaac Kohane. 2018b. Machine learning in medicine. New England Journal of Medicine 380: 1347–358. [Google Scholar] [CrossRef]
  35. Rose, Sherri, Alan M. Zaslavsky, and J. Michael McWilliams. 2017. Evaluation of a claims-based measure as an indicator of hospital quality. Medical Care Research and Review 74: 560–78. [Google Scholar]
  36. Rose, Sherri, and Thomas G. McGuire. 2017. Predicting healthcare costs at the patient level: A systematic review. Medical Care 55: 313–23. [Google Scholar]
  37. Seymour, Garry J., P. John Ford, Mark P. Cullinan, Steven Leishman, and Kazuhisa Yamazaki. 2007. Relationship between periodontal infections and systemic disease. Clinical Microbiology and Infection 13: 3–10. [Google Scholar] [CrossRef] [PubMed]
  38. Shickel, Benjamin, Patrick James Tighe, Azra Bihorac, and Parisa Rashidi. 2017. Deep ehr: A survey of recent advances in deep learning techniques for electronic health record (ehr) analysis. IEEE Journal of Biomedical and Health Informatics 22: 1589–604. [Google Scholar] [CrossRef] [PubMed]
  39. Steyerberg, Ewout W. 2019. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Berlin/Heidelberg: Springer. [Google Scholar]
  40. Stone, Mervyn. 1974. Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society: Series B 36: 111–33. [Google Scholar] [CrossRef]
  41. Tamang, Suzanne, Arnold Milstein, Henrik Toft Sørensen, Lars Pedersen, Lester Mackey, James R. Betterton, and Tina Hernandez-Boussard. 2017. Predicting patient costs at two urban emergency departments. PLoS ONE 12: e0172768. [Google Scholar]
  42. Thomson, Sarah, Josep Figueras, Tamás Evetovits, Matthew Jowett, Philipa Mladovsky, Anna Maresso, Jonathan Cylus, Marina Karanikolos, and Hans Kluge. 2013. Economic Crisis, Health Systems and Health in Europe: Impact and Implications for Policy. Brussels: European Observatory on Health Systems and Policies, pp. 1–279. [Google Scholar]
  43. Tibshirani, Robert. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B 58: 267–88. [Google Scholar] [CrossRef]
  44. Wong, Adrian, Xiaohui Liu, Kelvin Chan, and Wei Zhang. 2020. Feature selection and engineering for healthcare prediction models: A comprehensive review. IEEE Reviews in Biomedical Engineering 13: 98–112. [Google Scholar]
  45. Xu, Yang, Xiaohui Liu, Xudong Cao, Chuiping Huang, Enfu Liu, Sen Qian, Xinfeng Liu, Yingdong Wu, Fangxin Dong, Chenwei Qiu, and et al. 2022. Machine learning algorithms for predicting healthcare costs: A systematic review. Journal of Biomedical Informatics 129: 104057. [Google Scholar]
  46. Yang, Christopher C. 2022. Explainable artificial intelligence for predictive modeling in healthcare. Journal of Healthcare Informatics Research 6: 228–39. [Google Scholar] [CrossRef]
  47. Zou, Hui, and Trevor Hastie. 2005. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B 67: 301–20. [Google Scholar] [CrossRef]
Figure 1. Distribution of total healthcare costs.
Figure 1. Distribution of total healthcare costs.
Risks 13 00133 g001
Figure 2. Service utilization correlation matrix.
Figure 2. Service utilization correlation matrix.
Risks 13 00133 g002
Figure 3. ROC curve comparison.
Figure 3. ROC curve comparison.
Risks 13 00133 g003
Figure 4. Feature importance comparison.
Figure 4. Feature importance comparison.
Risks 13 00133 g004
Figure 5. Confusion matrix comparison.
Figure 5. Confusion matrix comparison.
Risks 13 00133 g005
Table 1. Variable descriptions and statistical classifications.
Table 1. Variable descriptions and statistical classifications.
Variable NameDescriptionCode AbbreviationStatistical Type
relationRelationship to contract ownerrelationCategorical
genderPatient gendergenderCategorical
policy_yearsYears covered by policypolicy_yrsContinuous
age_at_inceptionAge when policy startedage_inceptionContinuous
age_groupAge group categorizationage_grpCategorical
num_analysisNumber of analysis servicesnum_analysisContinuous
num_dentistryNumber of dentistry servicesnum_dentistryContinuous
num_diagnosticsNumber of diagnostic servicesnum_diagnosticsContinuous
num_endoscopyNumber of endoscopy servicesnum_endoscopyContinuous
num_hospitalizationsNumber of hospitalizationsnum_hospContinuous
num_mammographyNumber of mammography servicesnum_mammographyContinuous
num_operationsNumber of operationsnum_operationsContinuous
num_visitsNumber of visitsnum_visitsContinuous
service_diversityCount of different service types usedservice_divContinuous
high_utilizationHigh healthcare utilization indicatorhigh_utilBinary
Table 2. Descriptive statistics by utilization status.
Table 2. Descriptive statistics by utilization status.
VariableVariable TypeLow Utilization (n = 140,372)High Utilization (n = 35,660)
Mean (SD)MedianMean (SD)Median
Policy YearsContinuous2.48 (1.9)1.5033.83 (1.8)4.0
Age at InceptionContinuous37.7 (11.6)38.34336.4 (11.3)37.212
Analysis ServicesContinuous0.01 (0.1)00.99 (1.6)0
Dentistry ServicesContinuous0.01 (0.1)00.77 (0.8)1
Diagnostic ServicesContinuous0.005 (0.1)00.29 (0.6)0
Endoscopy ServicesContinuous0.000 (0.02)00.04 (0.2)0
HospitalizationsContinuous0.000 (0.01)00.14 (1.0)0
Mammography ServicesContinuous0.005 (0.1)00.46 (0.9)0
OperationsContinuous0.000 (0.0)00.009 (0.2)0
Medical VisitsContinuous0.02 (0.2)01.50 (1.8)1
Service DiversityContinuous0.05 (0.3)02.27 (1.1)2
Gender (Female)Categorical45,976 (32.8%)19,660 (55.1%)
Relation (Contract Owner)Categorical122,867 (87.5%)27,914 (78.3%)
Age Group (30–50)Categorical89,044 (63.4%)24,007 (67.3%)
All differences statistically significant at p < 0.001 .
Table 3. Model performance comparison on the test set.
Table 3. Model performance comparison on the test set.
ModelAccuracySensitivitySpecificityPrecisionF1-ScoreAUC
Logistic Regression0.9780.9340.9870.9460.9400.996
Random Forest0.9790.9600.9810.9280.9440.997
Support Vector Machine0.9770.9220.9890.9480.9350.995
Table 4. Logistic regression model results.
Table 4. Logistic regression model results.
VariableCoefficientOdds Ratio95% CIp-Value
Dentistry Services4.0255.46(25.24–121.84)<0.001
Service Diversity2.8116.64(9.59–28.88)<0.001
Operations2.5512.84(7.78–21.17)<0.001
Hospitalizations2.198.97(5.83–13.78)<0.001
Medical Visits1.584.87(3.57–6.65)<0.001
Mammography Services1.133.08(2.47–3.84)<0.001
Analysis Services1.002.71(2.23–3.29)<0.001
Endoscopy Services0.832.29(1.95–2.70)<0.001
Diagnostic Services0.411.50(1.39–1.63)<0.001
Gender (Male)−0.320.72(0.68–0.77)<0.001
Relation (Son)−0.280.75(0.71–0.80)<0.001
Policy Years0.051.05(1.04–1.06)<0.05
Table 5. Random forest variable importance.
Table 5. Random forest variable importance.
VariableImportance ScoreRankPercent Contribution
Service Diversity100.0118.8
Dentistry Services88.3216.6
Hospitalizations70.1313.2
Operations63.2411.9
Medical Visits58.7511.0
Mammography Services45.968.6
Analysis Services42.177.9
Endoscopy Services35.886.7
Diagnostic Services28.995.4
Policy Years15.2102.9
Table 6. Support vector machine model results.
Table 6. Support vector machine model results.
ParameterValueDescription
Kernel TypeRadial Basis FunctionNon-linear kernel for complex boundaries
Cost (C)1.0Regularization parameter
Sigma ( γ )0.1RBF kernel bandwidth parameter
Training Sample Size20,000Subsample for computational efficiency
Cross-Validation ROC0.9955-fold CV performance
PreprocessingCenter, Scale, NZVFeature standardization and selection
Table 7. Cross-validation vs. test set performances.
Table 7. Cross-validation vs. test set performances.
ModelCV ROCTest AUCDifferenceGeneralization
Logistic Regression0.9970.9960.001Excellent
Random Forest0.9960.997−0.001Excellent
Support Vector Machine0.9950.9950.000Excellent
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Seyam, E.A. Predicting High-Cost Healthcare Utilization Using Machine Learning: A Multi-Service Risk Stratification Analysis in EU-Based Private Group Health Insurance. Risks 2025, 13, 133. https://doi.org/10.3390/risks13070133

AMA Style

Seyam EA. Predicting High-Cost Healthcare Utilization Using Machine Learning: A Multi-Service Risk Stratification Analysis in EU-Based Private Group Health Insurance. Risks. 2025; 13(7):133. https://doi.org/10.3390/risks13070133

Chicago/Turabian Style

Seyam, Eslam Abdelhakim. 2025. "Predicting High-Cost Healthcare Utilization Using Machine Learning: A Multi-Service Risk Stratification Analysis in EU-Based Private Group Health Insurance" Risks 13, no. 7: 133. https://doi.org/10.3390/risks13070133

APA Style

Seyam, E. A. (2025). Predicting High-Cost Healthcare Utilization Using Machine Learning: A Multi-Service Risk Stratification Analysis in EU-Based Private Group Health Insurance. Risks, 13(7), 133. https://doi.org/10.3390/risks13070133

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop