Cloud Adoption in the Digital Era: An Interpretable Machine Learning Analysis of National Readiness and Structural Disparities Across the EU

Tudor, Cristiana; Florescu, Margareta; Polychronidou, Persefoni; Stamatiou, Pavlos; Vlachos, Vasileios; Kasabali, Konstadina

doi:10.3390/app15148019


by Cristiana Tudor ¹,*, Margareta Florescu ², Persefoni Polychronidou ³, Pavlos Stamatiou ³, Vasileios Vlachos ³ and Konstadina Kasabali ³

¹ Faculty of International Business and Economics, Bucharest University of Economic Studies, Romana Square 6, 010374 Bucharest, Romania

² Faculty of Administration and Public Management, Bucharest University of Economic Studies, Romana Square 6, 010374 Bucharest, Romania

³ Department of Economics, International Hellenic University, Terma Magnisias Street, 62124 Serres, Greece

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(14), 8019; https://doi.org/10.3390/app15148019

Submission received: 21 May 2025 / Revised: 26 June 2025 / Accepted: 11 July 2025 / Published: 18 July 2025

(This article belongs to the Special Issue Advanced Technologies Applied in Digital Media Era)



Featured Application

This study offers a practical tool for policymakers and digital strategists to assess and benchmark national cloud readiness. By using interpretable machine learning, it pinpoints the main infrastructure and socioeconomic factors influencing cloud adoption, allowing for focused interventions. The approach facilitates tracking progress toward digital transformation objectives like the EU Digital Decade and can be adapted for use in other areas. Furthermore, ICT development agencies, international organizations working on digital capacity building, and ministries of digital affairs can all benefit from the clustering analysis’s practical insights for customizing public policies and investment strategies based on a nation’s digital maturity profile.

Abstract

As digital transformation accelerates across Europe, cloud computing plays an increasingly central role in modernizing public services and private enterprises. Yet adoption rates vary markedly among EU member states, reflecting deeper structural differences in digital capacity. This study employs explainable machine learning to uncover the drivers of national cloud adoption across 27 EU countries using harmonized panel datasets spanning 2014–2021 and 2014–2024. A methodological pipeline combining Random Forests (RF), XGBoost, Support Vector Machines (SVM), and Elastic Net regression is implemented, with model tuning conducted via nested cross-validation. Among individual models, Elastic Net and SVM delivered superior predictive performance, while a stacked ensemble achieved the best overall accuracy (MAE = 0.214, R² = 0.948). The most interpretable model, a standardized RF with country fixed effects, attained MAE = 0.321, and R² = 0.864, making it well-suited for policy analysis. Variable importance analysis reveals that the density of ICT specialists is the strongest predictor of adoption, followed by broadband access and higher education. Fixed-effect modeling confirms significant national heterogeneity, with countries like Finland and Luxembourg consistently leading adoption, while Bulgaria and Romania exhibit structural barriers. Partial dependence and SHAP analyses reveal nonlinear complementarities between digital skills and infrastructure. A hierarchical clustering of countries reveals three distinct digital maturity profiles, offering tailored policy pathways. These results directly support the EU Digital Decade’s strategic targets and provide actionable insights for advancing inclusive and resilient digital transformation across the Union.

Keywords:

cloud computing adoption; digital transformation; machine learning; explainable AI (XAI); EU digital policy; ICT infrastructure

1. Introduction

Cloud computing is emerging as a key enabler in the digital transformation of organizations and governments, significantly contributing to increased productivity, cost reduction, and improved efficiency [1,2,3]. Through cloud services, organizations such as local governments can focus on their strategies, as the scalability and reliability of their systems are enhanced [4]. Strategies that incorporate multi-scale feature extraction and dual attention mechanisms effectively capture informative patterns at different levels of detail, enhancing decision-making and system responsiveness [5]. The cloud is now a foundation of digital transformation and is closely linked to service-oriented architecture. It integrates storage, applications, and business processes into a dynamic, reusable environment. In Digital Open Government (DOG), the use of multi-layered cloud structures reduces infrastructure needs and software costs [6]. Despite the benefits, cloud adoption is not uniform internationally. Factors such as the quality of the legal framework and broadband penetration influence its adoption. In particular, countries dependent on export-oriented businesses find it difficult to move to the cloud due to past investments in traditional technologies [7].

Cloud readiness, therefore, is not only about technological infrastructure but also about developing unique skills and knowledge. Multinational companies that have strategically invested in the cloud have recorded benefits in terms of operations and competitiveness, strengthening their sustainable development and external networks [8]. Additionally, the adoption of the cloud is also associated with quantifiable economic benefits, according to new empirical data, particularly when paired with enabling infrastructure like high-speed broadband. Nonetheless, the effects continue to vary among industries and nations, indicating the necessity of designing policies according to the circumstances [9].

Prior research has mostly concentrated on firm-level cloud computing adoption, with little investigation into the macro-level factors that influence it [7,10,11,12]. Senyo et al. [13] state, in their review of the cloud computing literature, that research on cloud computing at the macro (national) level will create more awareness of and support for favorable cloud computing policies. The European Commission aims to increase the access of European businesses and public authorities to cloud infrastructures and services in order to achieve the objective of the EU's Digital Decade, under which 75% of European businesses should use cloud-edge technologies for their activities by 2030.

Driven by this critical gap in the literature and its growing policy relevance, this study investigates the macro-structural determinants of cloud computing adoption across 27 EU countries. Existing research predominantly emphasizes firm- or sector-level factors, often using traditional econometric models constrained by linearity and limited interaction handling. In contrast, national-level adoption dynamics likely reflect more intricate, nonlinear relationships involving digital infrastructure, human capital, and socioeconomic context.

To capture this complexity, we propose an integrated methodology that combines interpretable machine learning (XAI) with panel econometric validation. Using harmonized Eurostat indicators, we build a dual-panel dataset covering 2014–2021 and 2014–2024. Random Forests [14], XGBoost [15], and Support Vector Machines are deployed to model adoption outcomes, while SHAP and ICE visualizations enable transparent interpretation of predictor influence [16]. This flexible architecture overcomes the functional limitations of fixed-effects and linear models [17,18], capturing both cross-country heterogeneity and temporal evolution.

Furthermore, we estimate a dynamic panel system GMM model to benchmark causal inferences, reinforcing the centrality of digital skills while highlighting methodological complementarity. Hierarchical clustering then segments EU countries into distinct digital maturity profiles, providing actionable input for differentiated policy design.

Together, these contributions offer a data-driven, policy-relevant framework aligned with the EU's Digital Decade priorities, advancing both explanatory insight and strategic guidance for accelerating cloud adoption across member states.

According to the empirical findings, the most significant determinants of cloud adoption are ICT professionals and broadband infrastructure, confirming the critical roles that connectivity and digital skills play. The clustering analysis also identifies distinct national profiles of digital maturity, which has direct implications for differentiated policy design throughout the EU.

The remainder of the paper is structured as follows. Section 2 reviews the relevant literature on cloud adoption and digital readiness. The data sources, variable definitions, and methodological approach, which includes clustering techniques and machine learning models, are presented in Section 3. The primary empirical findings, such as model performance, variable relevance, and country clustering, are presented in Section 4. The findings are addressed in Section 5, which also places the results in the larger framework of EU digital strategy. Section 6 concludes with limitations and directions for future research.

2. Literature Review

This section summarizes previous studies on the macro-level factors influencing cloud computing and digital transformation. The review is organized around five main themes: (i) the macroeconomic factors that influence cloud adoption; (ii) the importance of human capital and digital skills; (iii) infrastructure and regional preparedness; (iv) new digital technologies; and (v) the formulation of hypotheses based on these empirical findings.

2.1. Macro-Level Predictors of Cloud Adoption

Cloud computing has emerged as a key technology over the past decade, as it allows for the efficient management of computing resources through subscription models. Despite the advantages, its adoption is accompanied by challenges, mainly in terms of ecosystem readiness at the country level. The study by Tripathy & Jyotishi [10] examines the BSA Cloud Scorecard in 24 countries, recording that factors such as GDP per capita, business environment, research and development, and governance have significant and positive influences on cloud computing.

Moreover, factors such as business environment, research and development, and governance also have a significant and positive impact on the growth factors that affect the ecosystem’s readiness for cloud computing. In another macro-level study aiming to explain country-level predictors of cloud adoption, Vu et al. [7] find that institutional quality, broadband and internet penetration (no. of broadband subscriptions and internet users), and trade openness are significant predictors. As a result, this study advances our knowledge of the factors that encourage the uptake of cloud computing (CC), one of the most revolutionary technologies. The study’s conclusions suggest that governments are essential in encouraging the use of CC.

Motivated by the fact that developed economies have adopted cloud-based services, while emerging economies are still lagging, Tripathy et al. [11] investigate potential influential factors. Their findings suggest that the political-regulatory-business environment, investing in research and development, tertiary education, and knowledge workers significantly impact cloud computing readiness. Karamujic [12] tests the proposition that formal and informal national institutions (such as national culture) affect cloud computing adoption and finds that power distance, uncertainty avoidance, trade union strength, and government effectiveness are driving factors. According to some scholars (such as Straub [19]), formal institutions are created in order to legitimize informal institutions. As a result, some indirect links between the factors that have not been investigated yet may exist. This may potentially be a subject for more research.

In addition to traditional economic indicators, broader structural investments, such as innovation systems and sustainable infrastructure, are increasingly recognized as key to national readiness for digital transformation. Girlovan et al. [20] underscore how variables reflecting innovation and energy dynamics are essential not only for environmental sustainability but also as proxies for long-term systemic adaptability in EU member states. These insights align with the premise that macro-level capability-building underpins readiness for digital and cloud-based transitions.

2.2. Human Capital and Digital Skills in Digital Transformation

A study by Tuguskina et al. [21] on the development of the digital economy in Russia, as envisaged in the “Digital Economy of the Russian Federation” program, demonstrates that we cannot have development in the digital economy without strengthening human capital. Preparing specialized professionals with modern digital skills is crucial for increasing competitiveness, quality of life, and national sovereignty. Technologies related to digitalization penetrate everyday life and require broad participation of the population, especially through e-government. Therefore, digital skills are necessary for communication between citizens and public administration [22]. In the modern digital environment, human capital is the most valuable resource for the development of high-tech industries. The impact of digitalization on human capital development was analyzed in 82 Russian regions, confirming that investing in digital infrastructure, supporting higher education, and reducing digital inequality are critical prerequisites for strengthening human capital [23].

2.3. Digital Readiness, Macroeconomic Outcomes, and Regional Gaps

The study by Tudose et al. [24] analyzes the relationship between digital transformation and macroeconomic performance, using the Network Readiness Index (NRI). The findings show a positive correlation between digital maturity and GDP per capita. However, the study's exclusive reliance on the NRI is a significant limitation, and comparative evaluations using additional digital transformation measures would strengthen the evidence base. The case of Romania, according to Apostol [25], highlights challenges in the digital platform economy, such as low technological adoption and insufficient digital literacy. The study suggests strengthening infrastructure and improving the business ecosystem so that the country can actively participate in the industrial-digital revolution.

2.4. Infrastructure, Connectivity, and Technological Convergence

The adoption of the commercial internet in the 1990s radically restructured the digital infrastructure, paving the way for the development and consolidation of digital services such as the sharing economy, social media, e-commerce, and online advertising services. These digital services continue to grow and become more important in GDP. Indicatively, in 2017, e-retail in the US reached $545 billion, a 65% increase since 2012, while online advertising exceeded $105 billion, an increase of 250% [26]. Information and communication technologies (ICTs) continue to transform industries such as healthcare and energy through applications such as remote diagnostics and smart grids. Therefore, as businesses turn to digital transformation, the integration of IT and ICT infrastructure is required [27].

The European Union is investing in the deployment of high-speed broadband Internet with the aim of boosting growth and employment, especially in rural areas. However, the socio-economic benefits depend on the adoption of services and vary depending on the region and the specialization of the workforce [28]. At the same time, the adoption of 5G and the expansion of spectrum bring significant improvements in connectivity. In the Netherlands, for example, the integration of 5G spectrum can increase capacity per capita by 40% compared to LTE [29]. A 5G private network (5G-NPN) allows enterprises to deploy dedicated infrastructures with high reliability and speed, transforming production and adapting services to the increasing demands of the digital age [30].

2.5. Emerging Technologies and Organizational Readiness

The selection and implementation of digital technologies in digital transformation is a subject of intense study, especially due to the increasing importance of Artificial Intelligence (AI). AI enhances transformation but also introduces challenges. Holmström [31] proposes a framework for assessing organizational readiness for AI, focusing on four dimensions: technologies, activities, boundaries, and goals, demonstrating that this framework can facilitate the analysis of both the current socio-technical state of an organization’s AI and its prospects for adding value, as AI is expected to play a significant role in digital transformation. Shonubi’s [32] study shows that the convergence of organizational readiness with emerging technologies such as IoT and Information Technology 4.0 leads to technological progress provided there is support from management, governments, and suppliers. At the same time, Aftab et al. [33] emphasize the importance of digital leadership for business performance, showing that AI readiness strengthens the connection between leadership capabilities and innovation. The findings extend the leadership, AI, and big data literature and describe how their interaction might help improve economic and environmental outcomes in the digital era. Finally, Alfadhli et al. [34] examine the AI readiness of government institutions, focusing on digital transformation and data management, developing an assessment framework based on TRL and SAW, providing valuable guidance for the strategic adoption of AI in the public sector, highlighting the importance of a comprehensive approach to the adoption of artificial intelligence, and taking into account strategic alignment, technological capabilities, skills development, and resource allocation.

2.6. Hypotheses Development

Based on the literature reviewed, this study puts forward five hypotheses that reflect the macro-level factors influencing cloud computing adoption in EU member states. First, a number of studies highlight how education and human capital support digital transformation. Nations where a larger proportion of the population has completed postsecondary education are more likely to be able to adopt and use cutting-edge technology such as cloud computing [11,21]. Thus,

H1:

Cloud adoption rates are higher in nations with higher university education levels.

Second, there is a constant correlation between increased digital investment and macroeconomic strength. Cloud adoption is facilitated by a larger GDP per capita, which usually indicates both more budgetary space and a bigger demand for digital services [10,24]. Thus,

H2:

Higher GDP per capita is positively related to cloud adoption.

Third, digital transformation relies heavily on the availability of skilled personnel. The presence of ICT specialists within the workforce improves an economy’s technical capacity to adopt, maintain, and innovate in cloud-based environments [8,23]. Therefore,

H3:

A higher share of ICT specialists leads to cloud computing adoption.

Fourth, efficient use of cloud requires a strong digital infrastructure, especially broadband access. According to empirical data, nations with higher rates of broadband adoption are better equipped to implement and grow cloud services [7,28]. Thus,

H4:

Cloud adoption is facilitated by increased broadband penetration.

Lastly, cloud preparedness may be indirectly impacted by more general macroeconomic factors like unemployment. High unemployment rates have the potential to discourage investment and impede efforts at digital transformation, especially in the public and small business sectors where resources are limited [25]. In this case,

H5:

Cloud adoption is moderated by unemployment rates, which may impede digital investments.

3. Materials and Methods

3.1. Data Preparation

3.1.1. Data Sources

This study utilizes several datasets collecting macroeconomic, educational, technological, and digital infrastructure indicators for European Union (EU) countries. The variables chosen reflect three crucial aspects of national capacity (human capital, economic development, and digital infrastructure) and are based on the literature on cloud readiness and digital transformation [9,35,36].

The dependent variable, cloud computing adoption, is a direct proxy for digital transformation uptake by firms, particularly those with more than ten employees [37]. It serves as an outcome indicator of strategic ICT integration [38].

The independent variables were chosen to reflect enabling factors that are well acknowledged in both theoretical and empirical research on the diffusion of technology. The working-age population’s educational attainment and digital literacy are reflected in higher education levels, and these factors support the ability to absorb sophisticated digital tools like cloud platforms [39]. Prior research has confirmed a strong correlation between GDP per capita and national digital maturity, which is a measure of economic capacity and investment potential [40]. Higher unemployment rates can stifle both public and private investments in digital infrastructure and innovation, making them a structural restriction [41]. The implementation and upkeep of cloud services in both the public and private sectors depend heavily on the specialized digital workforce capability that ICT specialists represent [42]. For cloud computing and digital services in general to be technically feasible, broadband access is necessary. Even digitally savvy companies could find it difficult to fully utilize cloud capabilities without strong infrastructure [43].

All datasets are retrieved from Eurostat, ensuring international comparability and temporal consistency. The panel covers the 27 EU member states, excluding aggregates such as EU27 or EA19. Specifically, the sample includes Austria, Belgium, Bulgaria, Croatia, Cyprus, the Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, and Sweden.

To accommodate the limited availability of broadband data, two analytical panels were built: (i) Panel A (2014–2021), which includes broadband access as a predictor, and (ii) Panel B (2014–2024), which permits longer longitudinal analysis but does not include broadband.

This dual-panel approach allows for robustness checks across longer time horizons as well as the evaluation of short-term consequences of digital infrastructure.
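
The dual-panel construction can be illustrated with a minimal sketch. The paper reports an R workflow, so the Python below is only an illustration under stated assumptions: the record layout, the column names (`geo`, `year`, `cloud`, `ict`, `broadband`), the placeholder country codes, and all numeric values are hypothetical and are not taken from Eurostat.

```python
# Hypothetical country-year records; every value is a placeholder for illustration.
records = [
    {"geo": "AA", "year": 2020, "cloud": 50.0, "ict": 5.0, "broadband": 90.0},
    {"geo": "AA", "year": 2023, "cloud": 55.0, "ict": 5.5, "broadband": None},
    {"geo": "BB", "year": 2021, "cloud": 20.0, "ict": 2.5, "broadband": 80.0},
    {"geo": "BB", "year": 2024, "cloud": 25.0, "ict": 3.0, "broadband": None},
]

def build_panel(rows, first_year, last_year, drop=()):
    """Keep rows within [first_year, last_year] and remove the columns in `drop`."""
    panel = []
    for row in rows:
        if first_year <= row["year"] <= last_year:
            panel.append({k: v for k, v in row.items() if k not in drop})
    return panel

# Panel A: shorter window, broadband retained as a predictor.
panel_a = build_panel(records, 2014, 2021)
# Panel B: longer window, broadband column removed.
panel_b = build_panel(records, 2014, 2024, drop=("broadband",))
```

Keeping both panels in the same structure makes it straightforward to fit the same model specification twice and compare the estimates.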

3.1.2. Variable Construction and Data Processing

Each variable was extracted, filtered, and transformed to ensure conceptual alignment, statistical consistency, and comparability across EU countries and years. All data originate from Eurostat and follow standard EU statistical codes and classifications. Table 1 summarizes their definitions, measurement units, and sources.

All variables are available annually from 2014 to 2024, except for Broadband Access, which is limited to the period until 2021. Due to this data limitation, two separate analytical panels must be created: Panel A, which covers the period from 2014 to 2021 and includes internet infrastructure as a significant predictor, and Panel B, which covers the period from 2014 to 2024 but excludes broadband access. In addition to providing a thorough analysis of the temporal dynamics underlying cloud adoption, this dual-panel approach also acts as a robustness check, enabling an evaluation of the effects of including or excluding digital infrastructure on model estimates and the reliability of the conclusions reached. It is important to note that no automated feature selection techniques were applied, as all variables were deliberately chosen based on theoretical relevance and empirical support from the existing literature.

Data from each Eurostat data file are cleaned, filtered, and merged at the country-year level. Additionally, the two datasets are harmonized by converting all time indicators to integers and filtering for the specified EU countries, whereas missing data are dealt with through listwise deletion (complete-case analysis), ensuring that only observations with complete records across all selected predictors were retained for modeling. Finally, each variable is standardized using Z-score normalization as per Equation (1):

z_{i} = \frac{x_{i} - μ_{i}}{σ_{i}}

(1)

where μ_{i} and σ_{i} denote the mean and standard deviation of x_{i}, respectively. This guarantees the required numerical stability for training and interpretation by forcing all predictors to be centered around zero and carry a unit standard deviation.
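
As a small worked illustration of listwise deletion followed by Equation (1), the sketch below uses only the Python standard library. The helper names and the toy values are hypothetical, and the use of the sample standard deviation (divisor n − 1) is an assumption, since the text does not state which estimator was applied.

```python
import statistics

def complete_cases(rows, columns):
    """Listwise deletion: keep only rows with no missing value in `columns`."""
    return [r for r in rows if all(r.get(c) is not None for c in columns)]

def z_scores(values):
    """Equation (1): subtract the mean and divide by the standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample sd (n - 1); an assumption
    return [(x - mu) / sigma for x in values]

# Hypothetical toy records; None marks a missing observation.
rows = [
    {"ict": 2.0, "broadband": 70.0},
    {"ict": 4.0, "broadband": None},   # dropped: broadband is missing
    {"ict": 6.0, "broadband": 90.0},
    {"ict": 4.0, "broadband": 80.0},
]
kept = complete_cases(rows, ["ict", "broadband"])
ict_z = z_scores([r["ict"] for r in kept])
```

After standardization the retained values have mean zero and unit standard deviation, which is what makes coefficients and importance scores comparable across predictors measured in different units.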

3.2. Modeling

The relationship between cloud adoption and the stated socio-economic and technological aspects is examined in this study using artificial intelligence (AI) techniques. In particular, the dependent variable is modeled using supervised learning methods such as random forests, support vector machines, elastic net regression, and gradient boosting machines.

By taking into account non-linearities and intricate relationships that conventional linear models could miss, the models estimate cloud adoption levels from the independent variables. The relative influence of each predictor is inferred using partial dependence plots and variable importance metrics.

Cross-validation techniques are used to validate the models once they have been trained on the two core periods (2014–2021 and 2014–2024, respectively).

To ensure model reliability and avoid overfitting, all machine learning models were trained and evaluated using rigorous out-of-sample validation procedures. Data were randomly split into training (80%) and testing (20%) subsets, with stratification by country where appropriate to preserve representativeness. All parameter tuning, including model hyperparameters (such as the number of trees and mtry for Random Forest, learning rate and tree count for XGBoost, and regularization constants for Elastic Net and SVM), was conducted using 10-fold cross-validation on the training set, with performance assessed by RMSE and MAE. The final model evaluation was always performed on the separate, unseen test set to provide an unbiased measure of generalization performance.

For Random Forest models, we systematically varied the number of predictors sampled at each split (mtry) from 2 to 10, and explored up to 500 trees in the ensemble. XGBoost models were tuned over nrounds (50–300), max_depth (2–6), and eta (0.1–0.4). SVM models were assessed across a range of regularization constants (C). For each algorithm, the optimal configuration was chosen based on the lowest cross-validated RMSE. Dummy variables were generated prior to standardization, and both steps were embedded within the cross-validation folds to prevent data leakage.
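
The point about embedding preprocessing inside the cross-validation folds can be sketched as follows. This is a standard-library Python illustration rather than the caret workflow reported in the paper: the function names, the seed, and the toy predictor are hypothetical. The scaling parameters are estimated on the training part of each fold only and then applied to the held-out part, so no information from validation rows leaks into preprocessing.

```python
import random
import statistics

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def standardize_within_fold(x, train_idx, valid_idx):
    """Estimate mean and sd on the training rows only, then scale both parts."""
    train = [x[i] for i in train_idx]
    mu, sigma = statistics.mean(train), statistics.stdev(train)
    scale = lambda v: (v - mu) / sigma
    return [scale(x[i]) for i in train_idx], [scale(x[i]) for i in valid_idx]

x = [float(v) for v in range(1, 21)]   # hypothetical predictor values
folds = k_fold_indices(len(x), k=10)
for valid_idx in folds:
    # All folds except the current one form the training part.
    train_idx = [i for f in folds if f is not valid_idx for i in f]
    x_train, x_valid = standardize_within_fold(x, train_idx, valid_idx)
    # A model would be fitted on x_train and scored on x_valid at this point.
```

Standardizing the full dataset once before splitting would use the validation rows to estimate the mean and standard deviation, which is the leakage the fold-wise procedure avoids.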

To further evaluate the risk of overfitting, we compared training and test set metrics, reporting out-of-bag (OOB) error for Random Forests and all relevant test set metrics (RMSE, MAE, R²) for all models.

These validation practices, along with transparent reporting of parameter ranges and results, ensure the methodological soundness and reproducibility of our findings.

3.2.1. Random Forest Modeling

Given the complex, potentially nonlinear relationships among the variables influencing cloud adoption, this study employs advanced machine learning algorithms to model these associations robustly. Building on prior research [44,45], ensemble-based algorithms, particularly Random Forest regressors, are selected for their capacity to handle high-dimensional datasets, accommodate missing values, and capture intricate interaction effects. These characteristics render them especially suitable for modeling the multifaceted dynamics of digital transformation across countries and over time.

A Random Forest (RF) model estimates the relationship:

y = f (X) + ϵ

(2)

where f (X) is an unknown, potentially nonlinear function mapping the predictors to the dependent variable (cloud adoption), and ϵ \sim N (0, σ^{2}) accounts for unobserved random noise.

Specifically, for the first dataset, the RF will model

\text{Cloud\_Adoption} = f (\text{Higher\_Education}, \text{GDP\_per\_Capita}, \text{GDP}, \text{Unemployment\_Rate}, \text{ICT\_Specialists}, \text{Broadband\_Access})

(3)

whereas it will omit the last predictor for the second one.

The RF constructs an ensemble of decision trees \{T_{1}, T_{2}, \dots, T_{m}\}, each trained on bootstrap samples of the original data. Predictions are obtained as the average across all trees:

\hat{y} = \frac{1}{m} \sum_{k = 1}^{m} T_{k} (X)

(4)
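
The bootstrap-and-average idea behind Equation (4) can be shown with a deliberately reduced sketch. It is not a full Random Forest: each "tree" is a one-split stump on a single predictor, and the random sampling of predictors at each split (mtry) is omitted. All names, the seed, and the toy data are hypothetical.

```python
import random
import statistics

def fit_stump(xs, ys):
    """Fit a one-split regression tree on one predictor (a deliberately tiny tree)."""
    best = None
    for t in sorted(set(xs))[:-1]:                 # candidate split thresholds
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = statistics.mean(left), statistics.mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:                               # all x identical: predict the mean
        m = statistics.mean(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def fit_bagged_trees(xs, ys, m=50, seed=1):
    """Train m stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(m):
        sample = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in sample], [ys[i] for i in sample]))
    return trees

def predict(trees, x):
    """Equation (4): average the predictions of all trees."""
    return sum(tree(x) for tree in trees) / len(trees)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]     # hypothetical predictor
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]     # hypothetical outcome with a jump
trees = fit_bagged_trees(xs, ys)
low, high = predict(trees, 2.0), predict(trees, 7.0)
```

Because each stump sees a different bootstrap sample, individual predictions vary, and averaging them reduces that variance; this is the mechanism Equation (4) expresses.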

The models were trained on two distinct datasets reflecting different temporal domains, corresponding to the availability of the broadband variable. For the period 2014–2021, data include broadband access as a predictor; for 2014–2024, this variable is omitted, thus enabling a robustness check of the model’s stability and the influence of digital infrastructure.

The dataset was randomly split into training (80%) and testing (20%) subsets to evaluate out-of-sample predictive accuracy, with all splits fixed via a set seed to ensure reproducibility. To improve the forecasting trustworthiness, a linear trend analysis was performed on the historical data before modeling, and the results were then incorporated into the projections.

Ten-fold cross-validation was used to tune hyperparameters such as the number of trees (m), maximum tree depth (d), and minimum samples per split (s), selecting the configuration that minimizes the mean squared error (MSE):

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y}_{i})}^{2}

(5)

Parallel to RF, alternative learners, namely Extreme Gradient Boosting (XGBoost), Elastic Net regression, and Support Vector Machines (SVM with a linear kernel), were also implemented to benchmark predictive performance and assess model stability across different algorithms under identical validation schemes.

To evaluate model performance across different machine learning algorithms, we used 10-fold cross-validation, assessing each model M_{m} based on the Root Mean Squared Error (RMSE):

RMSE_{M_{m}} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y}_{i}^{(M_{m})})}^{2}}

(6)

where \hat{y}_{i}^{(M_{m})} is the predicted value for observation i using model M_{m}, and n is the number of observations in the validation fold.
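
For reference, the error metrics reported in this study can be computed as in the following standard-library sketch. The toy vectors are hypothetical. R² is computed here as 1 − SS_res/SS_tot, which is one common definition; the text does not state which definition was used, so this choice is an assumption.

```python
import math

def mse(y, y_hat):
    """Equation (5): mean of squared residuals."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Equation (6): square root of the MSE."""
    return math.sqrt(mse(y, y_hat))

def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def r_squared(y, y_hat):
    """Share of variance explained, as 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]   # hypothetical observed values
y_pred = [1.0, 2.0, 3.0, 6.0]   # hypothetical predictions, one large miss
```

In this toy case a single error of 2 yields an MAE of 0.5 but an RMSE of 1.0, which illustrates why RMSE penalizes large individual errors more heavily than MAE.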

Additionally, a weighted ensemble model was defined as

\hat{y}_{i}^{Ensemble} = \sum_{m = 1}^{M} w_{m} \hat{y}_{i}^{(M_{m})}

(7)

where w_{m} denotes the weight assigned to base model M_{m}. The ensemble was constructed using the caretEnsemble package, which optimizes the model weights to minimize cross-validated RMSE. Although SVM received zero weight in the final ensemble, it was fully trained and included in the optimization process. Model training times were recorded to benchmark efficiency.
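
The weighting logic of Equation (7) can be illustrated with a two-model sketch. This is not how caretEnsemble computes its weights internally; a simple grid search over a single convex weight is assumed here purely for illustration, and the prediction vectors are hypothetical.

```python
import math

def rmse(y, y_hat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def blend(preds, weights):
    """Equation (7): weighted sum of base-model predictions for each observation."""
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(len(preds[0]))]

def best_two_model_weights(y, pred_a, pred_b, steps=100):
    """Grid-search w in [0, 1] for the blend w * A + (1 - w) * B with lowest RMSE."""
    best_w, best_err = 0.0, float("inf")
    for s in range(steps + 1):
        w = s / steps
        err = rmse(y, blend([pred_a, pred_b], [w, 1.0 - w]))
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err

y_val = [1.0, 2.0, 3.0, 4.0]     # hypothetical validation outcomes
pred_a = [1.2, 2.2, 3.2, 4.2]    # hypothetical model A: over-predicts by 0.2
pred_b = [0.8, 1.8, 2.8, 3.8]    # hypothetical model B: under-predicts by 0.2
w, err = best_two_model_weights(y_val, pred_a, pred_b)
```

When two base models err in opposite directions, as in this toy case, an intermediate weight cancels much of the error. A model whose predictions add nothing beyond the others can end up with a weight of zero, consistent with the SVM result reported above.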

Of note, all continuous predictors were standardized (mean zero, unit variance) prior to model training, ensuring comparability across models. For models incorporating country fixed effects, the categorical country variable (geo) was first converted into binary indicators (one-hot dummies) and then included in the feature matrix subject to the same standardization procedure. This unified preprocessing pipeline was embedded within the cross-validation framework applied to the training set, effectively preventing data leakage.

Model hyperparameters were tuned using five-fold cross-validation conducted strictly within the training data. For Random Forests, mtry (the number of predictors sampled at each split) was optimized; for XGBoost, tuning covered nrounds, max_depth, and eta; for SVM with a linear kernel, only the regularization parameter C was tuned. Elastic Net models were optimized over alpha and lambda using the glmnet framework. At no point was the holdout test set used in parameter tuning. This separation between training and evaluation ensures robust generalization and mitigates overfitting, in line with best practices in applied machine learning [18,46].

All models were implemented in R 4.3.1 using caret and caretEnsemble and trained on a MacBook Air (Apple M3 chip) with RStudio 2024.12.1.

3.2.2. Country Segmentation via Hierarchical Clustering

We applied agglomerative hierarchical clustering with Ward’s linkage to identify latent groupings of EU countries based on their socioeconomic and digital characteristics [47,48]. Dissimilarity between countries i and j was measured by the Euclidean distance:

D(i, j) = \sqrt{\sum_{k = 1}^{p} {(X_{i k} - X_{j k})}^{2}}

(8)

where $X_{i k}$ denotes the standardized value of predictor k for country i, and p is the total number of predictors.

At each step, the algorithm merges the pair of clusters whose union yields the smallest increase in within-cluster variance, producing a dendrogram that graphically depicts the hierarchical structure of country similarity.
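The merging rule can be sketched in a few lines of Python. This is a naive illustration of Ward-style agglomeration on standardized feature vectors, not the implementation used in the study; the function names and the dictionary input format are assumptions, and real analyses would normally rely on an established clustering library.

```python
import math


def euclidean(a, b):
    """Equation (8): Euclidean distance between two feature vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares caused by merging a and b."""
    dist = euclidean(centroid(a), centroid(b))
    return len(a) * len(b) / (len(a) + len(b)) * dist ** 2


def ward_clustering(data, n_clusters):
    """Merge clusters greedily until n_clusters remain.

    `data` maps a label to its standardized feature vector.
    """
    clusters = [[label] for label in data]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(
                    [data[l] for l in clusters[i]],
                    [data[l] for l in clusters[j]],
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Recording the cost of each merge, rather than only the final partition, would give the heights needed to draw a dendrogram.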

3.3. Result Interpretation

3.3.1. Explainability and Model Interpretation

In keeping with Tudor et al. [49], this study uses explainable AI (XAI) strategies to solve interpretability issues that arise in complicated, nonlinear models. Specifically, the model’s output was broken down into the contribution of each predictor for individual predictions using SHapley Additive exPlanations (SHAP) values [50,51]:

\phi_{j} = \sum_{S \subseteq \{1, \dots, p\} \setminus \{j\}} \frac{|S|! \, (p - |S| - 1)!}{p!} \left[ f(S \cup \{j\}) - f(S) \right]

(9)

where f(S) is the prediction based solely on features in subset S. These values describe variable importance at both the local (individual prediction) and global levels, yielding detailed policy insights and helping stakeholders understand the factors driving cloud adoption.
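For a small number of features, Equation (9) can be evaluated exactly by enumerating all subsets. The Python sketch below does so for a toy model; the model, the instance, and the convention of replacing absent features with a baseline value are all assumptions made for illustration and are not taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, p):
    """Exact Shapley values for p features given a coalition value function."""
    phi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Weight |S|! (p - |S| - 1)! / p! from Equation (9).
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {j}) - value_fn(s))
        phi.append(total)
    return phi


def make_value_fn(model, x, baseline):
    """Assumed convention: features outside the subset take a baseline value."""
    def value(subset):
        return model([x[k] if k in subset else baseline[k] for k in range(len(x))])
    return value


def toy_model(x):
    """Hypothetical model with one interaction term."""
    return 2 * x[0] + 3 * x[1] + x[0] * x[1]


phi = shapley_values(make_value_fn(toy_model, [1.0, 2.0], [0.0, 0.0]), p=2)
```

Here `phi` evaluates to `[3.0, 7.0]`, and the two contributions sum to the difference between the full prediction (10) and the baseline prediction (0), so the interaction term is split equally between the two features.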

To supplement the SHAP analysis, Individual Conditional Expectation (ICE) curves were calculated to evaluate variability in predictor effects at the observation level [52]. For each observation i and predictor $x_{j}$, the ICE curve is defined as

\mathrm{ICE}_{i}(x_{j}) = f(x_{j}, x_{-j, i})

(10)

where $x_{-j, i}$ denotes all other predictors, held fixed at their actual values for observation i, and f(⋅) represents the fitted machine learning model.

These ICE curves provide a disaggregated view of predictor influence, highlighting variation across different countries and years.

3.3.2. Partial Dependence and Interaction Effects

Partial dependence plots (PDPs) were created to further elucidate the impact of each predictor and how the predictors interact. By averaging over the observed values of the other variables, these plots show the marginal effect of each predictor $x_{j}$ on the target variable:

\hat{f}_{x_{j}}(x_{j}) = \frac{1}{n} \sum_{i = 1}^{n} f(x_{j}, x_{-j, i})

(11)

where $x_{-j, i}$ represents all predictors except $x_{j}$, fixed at the observed values for each data point i. PDPs offer insight into the nature (linear, nonlinear, or threshold) of each predictor’s relationship with cloud adoption, without imposing strict parametric assumptions.
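The relationship between Equations (10) and (11) is easy to see in code: a partial dependence curve is the pointwise average of the ICE curves. The following Python sketch uses a hypothetical toy model and tiny dataset; none of the names or numbers come from the study.

```python
def ice_curve(model, row, j, grid):
    """Equation (10): vary feature j over `grid`, keeping the row's other features fixed."""
    curve = []
    for value in grid:
        modified = list(row)
        modified[j] = value
        curve.append(model(modified))
    return curve


def partial_dependence(model, rows, j, grid):
    """Equation (11): pointwise average of the ICE curves over all observations."""
    curves = [ice_curve(model, row, j, grid) for row in rows]
    return [sum(c[g] for c in curves) / len(curves) for g in range(len(grid))]


def toy_model(x):
    """Hypothetical model in which the effect of x[0] depends on x[1]."""
    return x[0] + x[0] * x[1]


rows = [[0.0, 0.0], [0.0, 2.0]]
grid = [0.0, 1.0, 2.0]
ice = [ice_curve(toy_model, r, 0, grid) for r in rows]
pdp = partial_dependence(toy_model, rows, 0, grid)
```

In this toy case the two ICE curves are `[0.0, 1.0, 2.0]` and `[0.0, 3.0, 6.0]`, while the PDP is `[0.0, 2.0, 4.0]`: the average hides the fact that the slope differs between observations, which is exactly the heterogeneity ICE curves are meant to expose.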

To extend the interaction analysis, we next measured and ranked predictors by their total interaction strength; the pair with the strongest joint effect was ICT specialists and broadband access. A contour plot of this interaction then provides an intuitive view of the nonlinear and multiplicative dynamics of cloud adoption.

Figure 1 includes a flowchart of the implemented method.

4. Results

4.1. Exploratory Data Analysis

This subsection presents the descriptive statistics of the standardized variables (Table 2). To facilitate comparability and interpretation within the random forest model, all variables were standardized to a mean of zero and a standard deviation of one, as previously specified.

While cloud adoption shows a moderate range of 4.11, the unemployment rate (6.14), which shows significant fluctuation among EU nations and years, has the widest range. Notably, the sampled countries’ economic differences are reflected in the range of 5.46 for GDP per capita.

Asymmetry and tail behavior are emphasized by skewness and kurtosis. The majority of the variables indicate moderate skewness, but the cloud adoption rate (0.84) and unemployment rate (1.64) show a noticeable right skew, suggesting a longer upper tail.

On the other hand, GDP per capita (−0.30) and broadband access (−0.59) exhibit a slight left skew. The unemployment rate is notable for its high kurtosis (3.80), which indicates a distribution with heavy tails and possible outliers, whereas broadband access (0.00) and higher education (−1.03) have relatively flat distributions when compared to a normal curve.
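For readers who wish to reproduce such shape statistics, a moment-based computation can be sketched as follows. The estimator is an assumption: the text does not state which formula was used, and the reported kurtosis values are interpreted here as excess kurtosis (zero for a normal distribution).

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: positive values indicate a longer upper tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so that a normal distribution scores zero."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3
```

Statistical packages often apply small-sample corrections to these formulas, so results may differ slightly from the values in Table 2.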

These descriptive insights highlight the differences among EU member states, especially with regard to digital infrastructure and labor market conditions, which are anticipated to affect cloud adoption trends in the ensuing modeling.

To illustrate the bivariate correlations between all variables, a correlation matrix (Appendix A.1) was created in addition to the descriptive statistics. The matrix shows strong positive correlations among cloud adoption, broadband access, and ICT specialists, while the unemployment rate shows weaker, negative associations.

Furthermore, to ensure the reliability of predictor effects, we assessed potential multicollinearity among core variables using Variance Inflation Factors (VIFs). The results indicated low multicollinearity: VIF values were 2.89 for ICT specialists, 2.77 for broadband access, and 1.91 for higher education. All variables fell well below conventional thresholds (VIF < 5), suggesting that multicollinearity is not a concern in our setting. Moreover, as our primary models are tree-based ensemble methods (e.g., Random Forest and XGBoost), which are inherently robust to moderate collinearity, no additional regularization or dimensionality reduction was required. The stability of feature importance across different algorithms and cross-validation folds further supports this robustness.
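A VIF can be obtained without fitting separate regressions, because the VIF of predictor k equals the k-th diagonal entry of the inverse of the predictors' correlation matrix. The Python sketch below illustrates this identity with a small Gauss-Jordan inversion; it is not the routine used in the study, and the function names are assumptions.

```python
import math


def correlation_matrix(columns):
    """Pearson correlations between predictor columns of equal length."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF_k is the k-th diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[k][k] for k in range(len(columns))]
```

With two predictors correlated at r, both VIFs equal 1 / (1 − r²); for example, r = 0.8 gives a VIF of about 2.78, which is in the same range as the values reported above.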

4.2. Model Performance and Variable Importance

The predictive performance of the Random Forest (RF) model was evaluated using the Root Mean Squared Error (RMSE) on the test dataset. Given the variation in cloud adoption rates among EU member states, the default RF model’s RMSE of 0.456 suggests an acceptable fit.

Using 10-fold cross-validation, we conducted hyperparameter tuning of the random forest model by exploring mtry values ranging from 2 to 10. The model’s predictive performance was relatively stable across this range, with the best configuration observed at mtry = 2, yielding an RMSE of 0.5639, R² of 0.6887, and MAE of 0.4158 (see Table 3). Performance differences among adjacent mtry values were minor, suggesting that the model was not highly sensitive to this hyperparameter in the absence of country and year fixed effects.

Variable importance was evaluated using the percentage increase in mean squared error (%IncMSE), a standard measure in random forest models that reflects how much worse the model performs when a given predictor is permuted. As per Table 4, in the tuned model, ICT specialists and broadband access are the most influential predictors, with importance scores of 29.16% and 21.23%, respectively. This affirms the central role of both digital skills and infrastructure in shaping cloud adoption outcomes across EU member states.
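The permutation idea behind %IncMSE can be sketched in a model-agnostic way: shuffle one predictor, re-score the model, and record how much the error grows. The Python fragment below is a simplified illustration only; the randomForest package computes %IncMSE from out-of-bag samples with its own normalization, which is not reproduced here, and all names are hypothetical.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of a prediction function on a dataset."""
    return sum((y - model(r)) ** 2 for r, y in zip(rows, targets)) / len(targets)


def permutation_importance(model, rows, targets, j, n_repeats=20, seed=0):
    """Mean increase in MSE when column j is shuffled across observations."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[j] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in position j.
        permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, shuffled)]
        increases.append(mse(model, permuted, targets) - baseline)
    return sum(increases) / n_repeats
```

The function returns an absolute increase rather than a percentage so that it remains defined when the baseline error is zero; dividing by the baseline MSE would give a percentage-style score.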

Economic and educational indicators played secondary but meaningful roles: GDP per capita (14.26%) and higher education (10.82%) were moderately important. In contrast, the unemployment rate continued to contribute the least (9.67%). These results are consistent with the hypothesis that digital capacity, not just socioeconomic status, is a primary driver of cloud uptake.

Figure 2 illustrates the relative importance of each variable visually, reinforcing the dominant influence of ICT-related factors on model predictions. The emphasis on digital enablers highlights key areas for targeted policy intervention.

These findings underscore that while economic and educational variables contribute meaningfully, it is digital infrastructure and workforce readiness that most strongly predict cloud adoption. The results further support policy strategies focused on broadband expansion and ICT skill development as levers for accelerating digital transformation across the EU.

4.3. Robustness Check

4.3.1. Random Forest with Country Fixed Effects

To account for unobserved structural differences between EU member states, we extended the random forest model by introducing country fixed effects using one-hot encoding of the geo variable. In addition, we retained year as a continuous numeric predictor to capture temporal trends in cloud adoption over the study period. This enhanced specification allowed the model to jointly estimate the impact of structural geographic variation and the progression of digital transformation over time.
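The encoding step can be sketched as follows. The `geo_` prefix mirrors the dummy names reported in the results, but the function itself and the example codes are illustrative assumptions rather than the study's code.

```python
def one_hot(codes, prefix="geo_"):
    """Convert a categorical column into named binary indicator columns."""
    levels = sorted(set(codes))
    names = [prefix + level for level in levels]
    matrix = [[1 if code == level else 0 for level in levels] for code in codes]
    return names, matrix


def build_features(codes, years):
    """Append the numeric year to each row of country indicators."""
    names, matrix = one_hot(codes)
    rows = [indicators + [year] for indicators, year in zip(matrix, years)]
    return names + ["year"], rows
```

Each row then contains exactly one active country indicator plus the year, so the model can separate level differences between countries from the common time trend.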

The inclusion of country identifiers resulted in a clear improvement in predictive performance. On the test set, the model achieved a root mean squared error (RMSE) of 0.421, a mean absolute error (MAE) of 0.321, and an R² of 0.864, outperforming the tuned model without fixed effects, which had an RMSE of 0.437, MAE of 0.327, and R² of 0.853. This improvement demonstrates that incorporating country-specific heterogeneity via dummy variables, along with a continuous temporal trend, enhances the model’s ability to generalize and capture complex variation in cloud adoption across EU member states.

The updated variable importance scores are shown in Table 5. Consistent with previous models, ICT specialists and broadband access remained the most influential predictors, with %IncMSE values of 28.18% and 20.12%, respectively. The year variable also emerged as highly predictive (14.27%), indicating a clear upward trend in cloud adoption over time. In addition, several country dummies, most notably Luxembourg (geo_LU) and Bulgaria (geo_BG), ranked among the top predictors, each contributing roughly 8–13% to model accuracy. This highlights the significance of national context, beyond economic or digital infrastructure measures alone.

A visual summary of these scores is provided in Figure 3, which reinforces the dominant role of digital infrastructure and skills while also illustrating the added explanatory power of temporal and geographic indicators. The inclusion of fixed effects did not displace core digital predictors but rather enriched the model’s context awareness.

4.3.2. Econometric Benchmark: System GMM Estimation

To complement our machine learning analysis and address potential endogeneity in cloud adoption dynamics, we estimated a system GMM model [53,54] as a conventional econometric benchmark. This approach accounts for unobserved heterogeneity, autocorrelation, and potential reverse causality, offering a robustness check on our primary results.

The model includes lagged cloud adoption, ICT specialists, higher education, GDP per capita, broadband access, and unemployment rate as regressors. Lagged cloud adoption and ICT specialists are instrumented using their own lags in levels and differences, while the remaining predictors are treated as exogenous.

Key results show a strong and statistically significant persistence effect in cloud adoption (lag coefficient = 0.825, p < 0.001), alongside a positive and significant impact of ICT specialists (p = 0.023). Other variables, including education, income, and broadband access, do not achieve conventional significance levels in this linear specification.
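For orientation, a generic dynamic panel specification of the kind typically estimated by system GMM can be written as below. The notation is an assumption introduced here and is not taken from the paper's appendix; the second line is a standard back-of-the-envelope reading of the reported persistence coefficient and should be treated as indicative only.

```latex
% Generic dynamic panel model (assumed notation):
% y_{it}: cloud adoption, x_{it}: regressors, \mu_i: country effect
y_{it} = \alpha \, y_{i,t-1} + \beta^{\top} x_{it} + \mu_{i} + \varepsilon_{it}

% With the reported estimate \hat{\alpha} = 0.825, a permanent change in a
% regressor is scaled in the long run by
\frac{1}{1 - \hat{\alpha}} = \frac{1}{1 - 0.825} = \frac{1}{0.175} \approx 5.71
```

Under this reading, short-run coefficients understate long-run effects by a factor of roughly 5.7, which is one way to interpret the strong persistence in adoption.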

These findings reinforce the results obtained from the random forest models: ICT workforce availability emerges as the most consistent and influential driver of cloud uptake across estimation strategies. Furthermore, all diagnostic tests (Sargan, AR(2)) indicate the validity of the instrument set and absence of higher-order autocorrelation.

However, the machine learning framework captures nonlinearities and complex interactions not fully accommodated by linear panel methods [55,56].

The system GMM full estimation output is reported in Appendix A.2, which also includes the Sargan and Arellano–Bond tests in detail.

4.3.3. Alternative Machine Learning Models and Ensemble Learning

To validate the robustness of our modeling approach, we developed and assessed a range of alternative machine learning models, including Extreme Gradient Boosting (XGBoost), Support Vector Machines (SVM) with a linear kernel, and Elastic Net regression, alongside our primary Random Forest (RF) model. All models were tuned via 10-fold cross-validation, and their predictive performance was compared using RMSE, MAE, and R².

Among the individual learners, Elastic Net achieved the best average performance (MAE = 0.224, RMSE = 0.297, R² = 0.907), followed closely by SVM and XGBoost. The Random Forest model, while slightly behind in overall prediction accuracy (MAE = 0.354, RMSE = 0.455, R² = 0.765), consistently demonstrated robust performance and stability across resamples.

We further combined all models into an ensemble using a greedy error-minimizing strategy. The resulting ensemble achieved the best overall MAE (0.214), assigning dominant weight to Elastic Net (75.5%) and XGBoost (22.0%), while RF contributed only marginally (2.5%). This reflects the ensemble’s bias toward models with minimal error variance in cross-validation.

However, while ensemble blending improved raw accuracy slightly, its complexity and lack of interpretability limit its practical value in policy settings. In contrast, the Random Forest model provides direct access to variable importance metrics and compatibility with SHAP explanations, making it especially useful for identifying key drivers of cloud adoption.

Crucially, across all models, ICT specialists emerged as one of the top three predictors, reaffirming the central role of digital workforce capacity (Table 6). Both XGBoost and RF ranked ICT specialists and broadband access as their top two predictors, while Elastic Net identified year and ICT specialists as the most influential variables.

These findings underscore that while alternative models may offer slight gains in predictive accuracy, Random Forest remains the most appropriate tool for this analysis due to its explanatory power, transparency, and alignment with policy needs.

To ensure transparency and reproducibility, we report the training time required for each model under standardized settings in Table 7. All models were trained on a MacBook Air 13” (Apple M3 chip) using RStudio Version 2024.12.1 and R 4.3.1, with 10-fold cross-validation for internal tuning. The computations were performed using the caret and caretEnsemble libraries.

Although the ensemble required the longest runtime, all models completed training in under 15 s, illustrating the feasibility of our approach even on lightweight consumer hardware. These timing benchmarks confirm that the proposed methods are not only statistically sound but also computationally efficient.

4.4. Explainable AI (XAI) Analysis

SHapley Additive exPlanations (SHAP) were used to better interpret the Random Forest (RF) model, providing detailed information on how each predictor influences cloud adoption. SHAP is a crucial tool for model transparency and policy interpretation because it quantifies both the average influence of each feature and the variability of its contribution across individual predictions.

The SHAP-based feature importance plot for Panel A (2014–2021) is shown in Figure 4. It is derived from the tuned Random Forest model (mtry = 10, ntree = 500) trained on the standardized dataset without country fixed effects. Of note, SHAP values were calculated using the iml package in R [57,58], which implements model-agnostic feature attribution based on the concept of marginal contributions from cooperative game theory [59]. Specifically, we used the FeatureImp class, which approximates permutation-based Shapley values by quantifying each feature’s marginal impact on model error (measured as increased MSE) when its values are permuted. This approach captures global importance while preserving consistency and local accuracy. The plot reflects feature contributions aggregated over all test observations. The predictor object was instantiated using the trained randomForest model and the held-out test dataset.

ICT specialists (ICT) and broadband access (Broadband) emerged as the dominant predictors, with the largest average contributions to lowering model error (as measured by mean squared error, MSE). These findings demonstrate the critical role that workforce skills and digital infrastructure play in driving cloud adoption throughout the EU.

Economic and educational metrics, such as GDP per capita (GDPpc) and higher education (HighEd), showed moderate importance, while the unemployment rate (Unemp) had the least average influence.

Notably, the plot also illustrates variability (horizontal bars), indicating how consistently each feature influences predictions. ICT and broadband access show both high average importance and wide variability, suggesting their impact fluctuates across different country-year contexts, likely reflecting disparities in both workforce skills and digital infrastructure maturity. In contrast, higher education and GDP per capita demonstrate smaller average importance and lower variability, underscoring their stable but less dominant roles.

These SHAP results reinforce and deepen the earlier random forest importance findings, confirming that broadband connectivity and ICT workforce capacity are critical drivers of cloud adoption. The combination of high mean importance and substantial variability for ICT and broadband access suggests that policy interventions to boost human resource skills and digital infrastructure can have transformative effects, particularly in lagging regions.

4.5. Nonlinear and Interaction Effects

To better understand how key variables affect cloud adoption, Partial Dependence Plots (PDPs) were created for broadband access, ICT specialists, and GDP per capita, together with a 2D interaction plot for broadband access and ICT specialists. These visualizations reveal nonlinear relationships and interaction effects that are not immediately visible from standard variable importance measures.

Figure 5 illustrates the strong positive relationship between cloud adoption and broadband access. The PDP reveals a nonlinear trend: predicted cloud adoption remains muted at very low levels of broadband access (standardized values below −1), rises sharply as broadband access approaches the average (about 0), and plateaus at higher levels. This implies that expanding broadband infrastructure beyond a certain threshold accelerates the adoption of cloud computing.

Figure 6 illustrates the relationship between ICT specialists and cloud adoption. The PDP reveals a strong, consistently positive relationship: as the share of ICT specialists increases, the predicted cloud adoption rate rises steadily, with no evident plateau. This trend underscores the critical role of digital workforce skills in facilitating cloud transformation across EU countries.

The PDP for GDP per capita (Figure 7) reveals a complex, nonlinear relationship with cloud adoption. Initially, predicted cloud adoption decreases significantly as GDP per capita rises from low levels (below 0); this seems counterintuitive and may indicate that cloud adoption is initially influenced more by digital readiness than by economic circumstances alone. Predicted cloud adoption, however, starts to rise quickly as GDP per capita moves above average levels (standardized values of about 1), indicating that wealthier economies undergo a stronger push toward digital transformation provided specific baseline criteria are satisfied.

The interaction between broadband access and ICT specialists is visualized in Figure 8. The contour plot reveals that predicted cloud adoption is highest when both broadband access and ICT specialist density are high (top-right corner of the plot) and lowest when both are low (bottom-left corner). The relationship is nonlinear and synergistic: while an increase in either component alone results in modest gains, increasing both broadband access and ICT capacity has a far greater overall impact, indicating that digital infrastructure and digital skills complement each other.

To further support the presence of nonlinear dependencies and conditional effects, we computed global feature interaction strengths using Friedman’s H-statistic, implemented via the Interaction$new() method in the iml package. This diagnostic quantifies how strongly each variable’s effect on the outcome depends on interactions with other features.
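The logic of the H-statistic for one feature against all others can be sketched as follows: if the model were additive in feature j, the centered prediction would equal the sum of the centered partial dependence on j and the centered partial dependence on the remaining features. This Python fragment is an illustrative reimplementation evaluated on the data points themselves, not the iml code, and it makes no claim about iml's exact scaling or sampling choices.

```python
def center(values):
    """Subtract the mean so that the function averages to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def h_statistic(model, rows, j):
    """Friedman-style H for feature j versus all remaining features."""
    n = len(rows)
    full = center([model(r) for r in rows])
    # Partial dependence on x_j: average over the other features' observed values.
    pd_j = center([
        sum(model(other[:j] + [row[j]] + other[j + 1:]) for other in rows) / n
        for row in rows
    ])
    # Partial dependence on everything except x_j: average over observed x_j.
    pd_rest = center([
        sum(model(row[:j] + [other[j]] + row[j + 1:]) for other in rows) / n
        for row in rows
    ])
    numerator = sum((f - a - b) ** 2 for f, a, b in zip(full, pd_j, pd_rest))
    denominator = sum(f ** 2 for f in full)
    if denominator == 0:
        return 0.0  # constant model: no variation to attribute
    return (numerator / denominator) ** 0.5
```

For a purely additive toy model the statistic is zero, whereas for a purely multiplicative model on centered inputs it reaches one, which gives a sense of the scale on which interaction strengths are compared.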

As shown in Figure 9, ICT specialists and broadband access exhibit the highest interaction strengths, indicating their predictive effects are not purely additive but conditional on other structural features. Other predictors such as GDP per capita and higher education also demonstrate notable interaction behavior. These findings reinforce earlier SHAP and PDP visualizations, emphasizing that cloud adoption in the EU arises from nonlinear, complementary relationships between human capital and digital infrastructure.

4.6. Country Clustering

We conducted a hierarchical clustering analysis of EU countries on the standardized predictor variables in order to investigate patterns of digital and socioeconomic similarity. The resulting dendrogram (Figure 10) identifies three main clusters:

(i): a group of mid-tier countries, such as Austria, Germany, Portugal, Italy, and the Czech Republic, with balanced but moderate levels of digital infrastructure and human capital;
(ii): a cluster of digitally lagging economies, such as Bulgaria, Romania, Slovakia, Latvia, and Hungary, which are characterized by lower digital maturity and broader socio-economic challenges; and
(iii): a group of digitally advanced countries, such as Luxembourg, Ireland, the Netherlands, Sweden, and Finland, which exhibit strong digital infrastructure, high ICT specialist density, and favorable socio-economic indicators.

The clustering emphasizes significant variation among EU member states and demonstrates how technological capability, human capital, and economic development interact to drive preparedness for digital transformation. Interestingly, despite differences in national context, Romania and Bulgaria are grouped directly within the same sub-cluster, indicating their strong resemblance in digital infrastructure and socioeconomic indicators. Within this cluster, their closeness to Slovakia and Latvia underscores the structural difficulties shared across parts of Eastern and Southeastern Europe.

This segmentation highlights the need for customized solutions to close gaps in digital maturity across the EU and provides a data-driven foundation for differentiated policy approaches to digital transformation.

4.7. Further Robustness Analysis: Estimations on Panel B (Excluding Broadband Access)

We conducted a complementary analysis using Panel B, which spans the extended period 2014–2024 but excludes broadband availability as a predictor, to assess the robustness of our findings over time and to examine the implications of infrastructure data loss. The tuned Random Forest (RF) model trained on this temporally broader dataset achieved a Root Mean Squared Error (RMSE) of 0.515, a Mean Absolute Error (MAE) of 0.358, and an R² of 0.737 on the test set. While this represents a moderate decline in predictive accuracy compared to Panel A (RMSE = 0.436, R² = 0.854), it still indicates strong generalization and confirms the relevance of key predictors even in the absence of broadband metrics.

As shown in Figure 11, ICT specialists remained the most influential factor (%IncMSE = 50.9), further emphasizing the pivotal role of digital workforce capacity in driving cloud adoption. Higher education (19.8%), unemployment rate (17.8%), and GDP per capita (16.4%) followed in importance, with a notable increase in the role of human capital variables. The exclusion of broadband slightly reshaped the relative contribution of remaining features, highlighting how digital infrastructure and skill development can act as partial substitutes in explaining adoption variation.

To enhance interpretability, Figure 12 presents a SHAP-based feature importance visualization for Panel B. The SHAP values reinforce the dominance of ICT specialists and higher education, with ICT contributing over four times more than any other predictor to the model’s predictive loss reduction. The tight confidence intervals across features suggest stable contributions across observations and support the robustness of the identified drivers.

Taken together, the Panel B results confirm the resilience of human capital indicators in the predictive framework and further demonstrate that, even in the absence of broadband data, structural readiness remains explainable through education, labor force, and economic indicators. However, the mild performance degradation confirms the added explanatory value of infrastructure variables, underscoring the necessity of including both human and physical capital in policy design.

5. Discussion

5.1. Trends in Cloud Adoption

The trajectory of cloud adoption in the EU from 2014 to 2024 is depicted in Figure 13, which shows a clear and steady upward trend. The average percentage of businesses using cloud computing increased from over 18% in 2014 to almost 50% by 2024, demonstrating how deeply ingrained cloud technologies have become in the European corporate environment. This steady expansion is in line with larger EU digitalization efforts, as cloud usage is becoming more widely acknowledged as a vital component of innovation and digital competitiveness [60].

The observed trend reflects global dynamics in which cloud computing has transitioned from a cost-reduction tool to a key enabler of business agility, scalability, and resilience [61,62]. Crucially, as businesses embraced remote work patterns and digital-first strategies, the COVID-19 pandemic served as a potent catalyst, accelerating cloud adoption significantly [63,64]. According to data through 2024, this pandemic-driven momentum was not merely transient but has since blended into longer-term trends of digital transformation. Public investment, regulatory congruence, and innovation-led development are shared enabling elements for broader transformations, whether they are digital or green. Tudor [65] emphasizes how the switch to clean energy improves economic resilience through innovation, job creation, and increased energy security in addition to having positive environmental effects. These dynamics provide a helpful analogy for comprehending the systemic effects of cloud adoption and investments in digital infrastructure in the European Union.

5.2. Key Drivers of Cloud Adoption

In line with the theoretical understanding that both digital infrastructure and digital skills are crucial enablers of digital transformation, the machine learning analysis shows that ICT specialists and broadband access are the most significant predictors of cloud adoption [35,66]. This supports the European Commission’s Digital Economy and Society Index (DESI) framework, which emphasizes the dual significance of human and physical capital in promoting technology breakthroughs [36]. In fact, the synergy between digital skills and infrastructure is hard to overstate: while strong infrastructure offers the tools required for transformation, the presence of workers with digital competency ensures that these tools are used efficiently. Thus, the present findings indicate that organizations must approach digital transformation holistically by investing in both advanced technological frameworks and the necessary human capital.

Furthermore, GDP per capita and higher education levels exhibit moderate but consistent importance, reflecting the broader socio-economic capacity necessary to support and sustain the adoption of advanced digital technologies [40]. This is consistent with earlier studies showing that higher education is linked to greater digital literacy, which improves people’s capacity to interact with cutting-edge technologies [39]. As an illustration of how educational attainment enables people to navigate the digital landscape, a systematic evaluation revealed that those with higher education had a 37% higher likelihood of using digital platforms efficiently [39]. It is interesting to note that the robustness check (i.e., Panel B estimations) showed that human capital, as measured by ICT skills, continues to be the primary driver even in the absence of broadband data, albeit with a minor drop in predictive accuracy. This emphasizes how important workforce upskilling is as a robust component of digital adoption.

Additionally, it has been shown that regions with higher economic means typically have better internet access and educational opportunities, fostering a population more adept at technological adoption [41]. According to a review of the literature, there is a strong correlation between higher GDP per capita and digital engagement, and socioeconomic factors such as income and educational attainment have a significant impact on access to and use of digital services [67]. Recent research likewise indicates that the socioeconomic environment, as defined by GDP per capita and educational attainment, fosters the use of digital technologies. The partial dependence plots add that the two leading predictors differ in the shape of their effects: ICT specialists show a consistent, approximately linear influence on cloud adoption, while broadband access exhibits a threshold effect. Early gains in broadband access greatly increase cloud adoption because of improved connectivity and lower latency, but once a certain level of coverage is attained, further expansion yields diminishing marginal benefits.

This idea is supported by earlier research. Gallardo et al. [43], for instance, point out a positive relationship between employment productivity and broadband access, indicating that while increases in connectivity initially have a major impact on economic results, these benefits decrease as broadband penetration increases. Conversely, a steady, linear connection characterizes the influence of ICT specialists on cloud adoption. According to research, every extra ICT specialist raises cloud adoption rates, indicating that their knowledge directly promotes efficient usage of cloud services. Albar and Hoque [68], for example, discovered that the availability of qualified staff had a major impact on cloud adoption choices, highlighting the importance of ICT proficiency for effective deployment. According to Tweneboah-Koduah et al. [42], companies with sufficient ICT capabilities are more likely to effectively use cloud solutions, which will improve the results of their digital initiatives.

5.3. Clustering and Heterogeneity Across EU Countries

The hierarchical clustering analysis reveals notable differences in the level of digital preparedness among EU members.

The top cluster of digitally advanced nations includes Luxembourg, the Netherlands, Finland, Ireland, and Sweden, which combine strong digital workforces, good broadband infrastructure, and high cloud adoption rates. In line with earlier studies on digital frontrunners in Europe, these nations are characterized by developed digital ecosystems and advantageous socioeconomic circumstances [69,70,71].

A second cluster comprises mid-tier nations such as Austria, Germany, Portugal, Italy, and the Czech Republic. These countries have moderate ICT workforce density and infrastructure levels and exhibit balanced but not exceptional digital performance. Relative to the digital leaders, their positioning implies consistent digital growth but also points to gaps that remain to be closed.

Bulgaria, Romania, Slovakia, Latvia, and Hungary are among the digitally lagging economies in the third cluster, which has lower ratings on important metrics of economic development, human capital, and digital adoption. Concerns over unequal progress in attaining digital cohesion are reiterated by this cluster, which demonstrates ongoing digital inequalities inside the EU [72,73]. According to Pejić Bach et al. [74], the rise of Industry 4.0 breakthroughs is expected to significantly disadvantage populations from less educated nations, highlighting the persistent problems related to uneven technical progress throughout the EU.

Despite the relatively robust ICT infrastructure and broadband penetration rates of Romania and Bulgaria, it is noteworthy that the two countries fall within the same sub-cluster. Their grouping with nations like Slovakia and Latvia implies that although digital infrastructure is a vital enabler, overall digital maturity is largely shaped by broader socioeconomic factors such as GDP per capita, unemployment rates, and the uptake of higher education. This finding lends credence to the idea that infrastructure by itself cannot propel digital transformation unless it is combined with concurrent investments in equitable economic growth, institutional capacity, and education [75].
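How such groupings emerge can be sketched with a small agglomerative clustering example. The code assumes Ward's minimum-variance criterion [47], which the reference list cites; the exact linkage and preprocessing used for Figure 10 are not restated in this section. The units `U1` to `U6` and their standardized scores are invented for illustration and do not correspond to any country or Eurostat value.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(data, k):
    """Merge the cheapest pair of clusters repeatedly until k clusters remain."""
    clusters = [[label] for label in data]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([data[l] for l in clusters[i]],
                                 [data[l] for l in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical standardized scores: (cloud adoption, ICT specialists, broadband).
# U5 and U6 have middling broadband but low scores elsewhere.
data = {
    "U1": (1.2, 1.1, 0.9),
    "U2": (1.0, 1.3, 1.1),
    "U3": (0.1, 0.0, 0.2),
    "U4": (-0.1, 0.1, -0.2),
    "U5": (-1.1, -1.2, 0.3),
    "U6": (-1.3, -1.0, 0.1),
}

clusters = ward_clusters(data, k=3)
print(clusters)
```

In this toy example, U5 and U6 end up together in a separate group even though their broadband scores are not the lowest, because the criterion weighs all indicators jointly. This mirrors the argument that infrastructure alone does not determine a country's cluster.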

5.4. Policy Implications

To sum up, the current results suggest that cloud adoption is shaped by a combination of structural readiness, national context, and temporal evolution. Policymakers aiming to boost digital uptake should therefore consider not only investments in infrastructure and skills but also country-specific conditions and trends that may influence adoption trajectories.

Importantly, our Random Forest model with country dummies reveals substantial cross-national heterogeneity in predictive influence. Country indicators such as geo_LU (Luxembourg) and geo_FI (Finland) had among the highest importance scores, reflecting advanced infrastructure and digital maturity. Conversely, dummies for geo_BG (Bulgaria) and geo_RO (Romania) also emerged as significant, but in the context of lower adoption baselines—suggesting structural challenges that still materially shape outcomes. These results highlight the need for differentiated, nation-specific policy strategies. For instance, countries in Southern and Eastern Europe with lower digital workforce capacity and broadband saturation may require greater investment in ICT skill-building and last-mile infrastructure. In contrast, high-performing nations like Finland and Luxembourg may benefit more from innovation incentives and policy harmonization.
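For readers unfamiliar with the `geo_LU`-style indicators, the sketch below shows one way country dummies of this form can be built from a country-code column. The function name `add_country_dummies`, the column names, and the placeholder feature `x` are assumptions made for illustration; the authors' actual encoding step in R is not shown in the article.

```python
def add_country_dummies(rows, key="geo"):
    """Replace the categorical `key` column with one 0/1 indicator per observed code."""
    codes = sorted({row[key] for row in rows})
    encoded = []
    for row in rows:
        out = {k: v for k, v in row.items() if k != key}
        for code in codes:
            out[f"{key}_{code}"] = 1 if row[key] == code else 0
        encoded.append(out)
    return encoded

# Placeholder observations: `x` stands in for any predictor and is not real data.
rows = [
    {"geo": "LU", "year": 2020, "x": 1.0},
    {"geo": "FI", "year": 2020, "x": 2.0},
    {"geo": "BG", "year": 2020, "x": 3.0},
    {"geo": "RO", "year": 2020, "x": 4.0},
]

encoded = add_country_dummies(rows)
for row in encoded:
    print(row)
```

Each row carries exactly one active indicator, so a tree-based model can split on, for example, `geo_LU` to isolate a country-specific baseline that the structural predictors do not explain.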

Several broader empirical patterns further contextualize these findings. For instance, the digital divide in Europe is multidimensional and persistent. Internet access or adoption is considered the first-level digital divide in Europe [76]. The Digital Agenda for Europe aimed to enhance internet access and adoption among all European people, particularly through initiatives that promote digital literacy and accessibility [77]. While the Digital Agenda for Europe and the EU Digital Decade Policy Programme have emphasized universal access, disparities remain pronounced in rural and underserved regions, especially in Eastern and Southern Europe.

Furthermore, online engagement and the range of activities performed are identified as the second-level digital divide. Education and income are the most consistent predictors of internet engagement: better educated people have stronger internet awareness and are more capable of evaluating online material, whereas less educated people have weak internet abilities [78]. The digital gap in Europe is tangible and multifaceted, as indicated by empirical data [79]. Current studies indicate that digital inequality leads to disparities in digital human capital [80].

In this context, our models identify ICT specialist density and higher education as consistent, high-impact predictors of cloud adoption. This underscores the continued importance of investing in human capital and digital skills, particularly in Southern and Eastern European countries that score lower on these metrics. Notably, although Romania and Bulgaria exhibit strong technical indicators in some domains, these have not yet translated into equivalent levels of adoption, suggesting a need for broader socio-institutional reforms [81]. Likewise, Portugal’s high school dropout rate [76] may help explain its relatively lagging performance, despite otherwise improving digital infrastructure. In contrast, advanced digital economies such as Finland and Luxembourg should now focus on consolidating leadership positions through policy harmonization, institutional streamlining, and strategic innovation incentives.

Additionally, while this study focuses on socio-technological determinants of adoption, our findings conceptually align with research on system-level efficiency, such as Ali et al. [82], whose VMR algorithm addresses quality-of-service and energy-aware orchestration in cloud environments. Both perspectives emphasize the need for intelligent infrastructure planning, whether through backend orchestration or national readiness initiatives.

Finally, the empirical dominance of ICT specialists and broadband access in our models provides quantitative validation for key priorities outlined in the EU Digital Decade Policy Programme (2021–2030), particularly the targets of reaching 75% cloud adoption by businesses, ensuring gigabit connectivity across all EU households, and cultivating a digitally skilled workforce with at least 20 million ICT specialists employed across the Union. Our results therefore offer not only methodological insights but also actionable input for tailored policy deployment, aligned with instruments such as the Digital Europe Programme (DEP) [83,84] and the Recovery and Resilience Facility (RRF).

6. Conclusions

This study set out to examine the key macro-level determinants of cloud adoption across 27 EU member states by integrating socio-economic, educational, and technological variables into a machine learning framework. Using two harmonized panels (2014–2021 and 2014–2024), we trained and validated several models, including Random Forest, XGBoost, and SVM, to predict national cloud uptake rates. The models were further interpreted using explainable AI techniques such as SHAP values and ICE curves, complemented by hierarchical clustering, yielding both predictive insights and actionable interpretations.

The findings consistently highlight the dominant role of digital human capital and infrastructure in enabling cloud transformation. Specifically, the presence of ICT specialists and broadband access emerged as the most influential predictors of cloud adoption. Higher levels of tertiary education and GDP per capita also contributed positively, albeit with a more moderate effect. Conversely, higher unemployment rates were negatively associated with cloud readiness, likely reflecting reduced capacity for investment and innovation. Partial dependence and interaction plots revealed important nonlinearities and synergies, especially between digital skills and broadband infrastructure, while hierarchical clustering uncovered clear groupings of digitally advanced, mid-tier, and lagging countries.

These insights carry strong policy relevance. First, they suggest that strengthening digital skills through vocational and higher education, particularly in ICT-related fields, remains a critical priority at the EU level. Second, investment in broadband infrastructure, especially in rural and underserved regions, is essential to unlock the full potential of cloud-based services. Third, the clustering analysis underscores the need for differentiated strategies across EU member states. For digitally lagging countries, simultaneous investment in infrastructure and human capital is vital, whereas mid-tier nations may benefit from targeted innovation policies and harmonized digital regulation. In contrast, advanced economies should focus on consolidating digital leadership and addressing institutional bottlenecks that may hinder further cloud scaling. Countries such as Romania and Bulgaria, despite strong technical indicators, may require broader socio-economic interventions to translate digital capacity into adoption outcomes.

Nevertheless, this study is not without limitations. The absence of broadband data beyond 2021 constrained full-panel modeling, limiting the temporal horizon of some analyses. While machine learning models and explainability tools offer rich insights, they remain correlational in nature; causal inference would benefit from complementary approaches such as natural experiments or instrumental variable designs. Moreover, national-level analyses cannot capture the heterogeneity that exists within countries; future studies could therefore incorporate firm-level or regional microdata to assess intra-country disparities. In addition, while our models capture cross-sectional and temporal variation, future research could benefit from explicitly incorporating dynamic modeling frameworks to account for delayed policy impacts and feedback effects over time. The comparative scope could also be extended beyond the EU27 by including EFTA countries or adjacent regions, which would help assess divergence in structural readiness, regulatory environments, and digital policy implementation. Finally, future studies could explore how national progress toward Digital Decade targets interacts with cloud adoption over time, potentially leveraging dynamic policy indicators or national digital investment scores.

Author Contributions

Conceptualization, C.T.; methodology, C.T.; software, C.T.; validation, C.T. and M.F.; formal analysis, C.T.; investigation, C.T., M.F., P.S., V.V. and P.P.; resources, C.T.; data curation, C.T.; writing—original draft preparation, C.T., M.F., P.S., V.V., P.P. and K.K.; visualization, C.T.; supervision, C.T.; project administration, C.T.; funding acquisition, C.T. and M.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially funded by the EU's NextGenerationEU instrument through the National Recovery and Resilience Plan of Romania—Pillar III-C9-I8, managed by the Ministry of Research, Innovation, and Digitalization, within the project with code CF 158/31.07.2023, contract no. 760248/28.12.2023. It was also partially supported by a grant offered by the Romanian National Commission for Financing Higher Education (CNFIS) through the Institutional Development Fund for state universities, for the project “Promoting excellence in research through interdisciplinarity, digitalization and integration of Open Science principles to increase international visibility (ASE-RISE)”, contract number: CNFIS-FDI-2025-F-0457.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data are publicly available from Eurostat. Dataset codes and URLs: Cloud adoption (isoc_cicce_use): https://ec.europa.eu/eurostat/databrowser/view/isoc_cicce_use/; Higher education (edat_lfs_9903): https://ec.europa.eu/eurostat/databrowser/view/edat_lfs_9903/; GDP per capita (nama_10_gdp): https://ec.europa.eu/eurostat/databrowser/view/nama_10_gdp/; Unemployment (une_rt_m): https://ec.europa.eu/eurostat/databrowser/view/une_rt_m/; ICT specialists (isoc_sks_itspt): https://ec.europa.eu/eurostat/databrowser/view/isoc_sks_itspt/; Broadband access (isoc_r_broad_h): https://ec.europa.eu/eurostat/databrowser/view/isoc_r_broad_h/. All data were last accessed on 20 June 2025.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
DOG	Digital Open Government
EU	European Union
GDP	Gross Domestic Product
ICT	Information and Communication Technologies
ICE	Individual Conditional Expectation
ML	Machine Learning
NRI	Network Readiness Index
PDP	Partial Dependence Plot
RF	Random Forest
SHAP	SHapley Additive exPlanations
SVM	Support Vector Machine
XAI	Explainable Artificial Intelligence
XGBoost	eXtreme Gradient Boosting

Appendix A

Appendix A.1. Correlation Analysis

Figure A1. Correlation Matrix of Cloud Adoption Predictors.

Appendix A.2. System GMM Results

Table A1. System GMM Estimation Results for Cloud Adoption (EU Countries, 2014–2021).

Predictor	Estimate	Std. Error	z-Value	p-Value	Signif.
lag (Cloud Adoption, 1)	0.825	0.062	13.37	<0.001	***
Higher Education	−0.109	0.109	−1.01	0.315
GDP per Capita	−0.081	0.126	−0.64	0.520
Unemployment Rate	0.026	0.098	0.27	0.791
ICT Specialists	3.227	1.417	2.28	0.023	*
Broadband Access	0.069	0.171	0.40	0.688

Significance codes: *** p < 0.001, * p < 0.05.

Model Statistics:

Unbalanced Panel: n = 26 countries, T = 4–7 years, N = 162 (138 obs. used)
Sargan Test: χ²(25) = 18.57, p = 0.82 (instruments valid)
AR(1) Test: z = −2.64, p = 0.008 (expected in differences)
AR(2) Test: z = 1.05, p = 0.30 (no second-order autocorrelation)
Wald Test for All Coefficients: χ²(6) = 4468.7, p < 0.001
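The z-values and p-values in Table A1 follow from the reported estimates and standard errors. The sketch below recomputes them under the assumption of a standard two-sided normal (Wald) test; the function name `wald_z_and_p` is introduced here for illustration. Because the table reports rounded estimates and standard errors, the recomputed figures can differ slightly from the published ones (for example, for the lagged dependent variable).

```python
import math

def wald_z_and_p(estimate, std_error):
    """Return the z-statistic and its two-sided p-value under a standard normal."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# (estimate, standard error) pairs as printed in Table A1.
table_a1 = {
    "lag (Cloud Adoption, 1)": (0.825, 0.062),
    "Higher Education": (-0.109, 0.109),
    "GDP per Capita": (-0.081, 0.126),
    "Unemployment Rate": (0.026, 0.098),
    "ICT Specialists": (3.227, 1.417),
    "Broadband Access": (0.069, 0.171),
}

for name, (est, se) in table_a1.items():
    z, p = wald_z_and_p(est, se)
    print(f"{name:26s} z = {z:6.2f}   p = {p:.3f}")
```

For ICT Specialists, the ratio 3.227/1.417 gives a z-value of about 2.28 and a two-sided p-value of about 0.023, matching the only predictor other than the lagged term that is flagged as significant in the table.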

References

1. Garrison, G.; Wakefield, R.L.; Kim, S. The effects of IT capabilities and delivery model on cloud computing success and firm performance for cloud supported processes and operations. Int. J. Inf. Manag. 2015, 35, 377–393.
2. Khayer, A.; Bao, Y.; Nguyen, B. Understanding cloud computing success and its impact on firm performance: An integrated approach. Ind. Manag. Data Syst. 2020, 120, 963–985.
3. Khayer, A.; Talukder, S.; Bao, Y.; Hossain, N. Cloud computing adoption and its impact on SMEs’ performance for cloud supported operations: A dual-stage analytical approach. Technol. Soc. 2020, 60, 101225.
4. Ali, A.; Ullah, I.; Ahmad, S.; Wu, Z.; Li, J.; Bai, X. An attention-driven spatio-temporal deep hybrid neural networks for traffic flow prediction in transportation systems. IEEE Trans. Intell. Transp. Syst. 2025.
5. Ali, A.; Ullah, I.; Singh, S.K.; Jiang, W.; Alturise, F.; Bai, X. Attention-Driven Graph Convolutional Networks for Deadline-Constrained Virtual Machine Task Allocation in Edge Computing. IEEE Trans. Consum. Electron. 2025.
6. Alenizi, A.S.; Al-karawi, K.A. Cloud Computing Adoption-Based Digital Open Government Services: Challenges and Barriers. In Lecture Notes in Networks and Systems, Proceedings of Sixth International Congress on Information and Communication Technology, London, UK, 25–26 February 2021; Yang, X.S., Sherratt, S., Dey, N., Joshi, A., Eds.; Springer: Singapore, 2022; Volume 216.
7. Vu, K.; Hartley, K.; Kankanhalli, A. Predictors of cloud computing adoption: A cross-country study. Telemat. Inform. 2020, 52, 101426.
8. Mitra, A.; O’Regan, N.; Sarpong, D. Cloud resource adaptation: A resource based perspective on value creation for corporate growth. Technol. Forecast. Soc. Chang. 2018, 130, 28–38.
9. Katz, R.; Jung, J. Economic spillovers from cloud computing: Evidence from OECD countries. Inf. Technol. Dev. 2024, 30, 173–194.
10. Tripathy, S.; Jyotishi, A. Macro Factors Affecting Cloud Computing Readiness: A Cross-Country Analysis. In Lecture Notes in Electrical Engineering, Proceedings of the Advances in Data Sciences, Security and Applications, Jaipur, India, 20–22 December 2019; Jain, V., Chaudhary, G., Taplamacioglu, M., Agarwal, M., Eds.; Springer: Singapore, 2020; Volume 612.
11. Tripathy, S.; Sengupta, A.; Jyotishi, A. Where do countries stand in cloud computing readiness? A country-level analysis of capacity and potential. J. Inf. Technol. Politics 2023, 20, 469–483.
12. Karamujic, L. Impact of national institutions on cloud computing adoption. Comparison to mobile broadband adoption. J. Glob. Inf. Technol. Manag. 2025, 28, 6–29.
13. Senyo, P.K.; Addae, E.; Boateng, R. Cloud computing research: A review of research themes, frameworks, methods and future research directions. Int. J. Inf. Manag. 2018, 38, 128–139.
14. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
15. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
16. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 1–10. Available online: https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf (accessed on 7 May 2025).
17. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2009; Volume 2, pp. 1–758.
18. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: New York, NY, USA, 2013; Volume 103.
19. Straub, D. The effect of culture on IT diffusion: E-Mail and FAX in Japan and the U.S. Inf. Syst. Res. 1994, 5, 23–47.
20. Girlovan, A.; Tudor, C.; Saiu, G.R.; Guse, D.D. Exploring the impact of globalization and economic-energy dynamics on environmental sustainability in the EU. Glob. Transit. 2025, 7, 41–55.
21. Tuguskina, G.; Rozhkova, L.; Taktarova, S.; Salnikova, O. The Role Of Human Capital In The Digital Economy. In Global Challenges and Prospects of the Modern Economic Development, 57; Mantulenko, V., Ed.; European Proceedings of Social and Behavioural Sciences; Future Academy: Cupertino, CA, USA, 2019; pp. 960–968.
22. Stofkova, J.; Poliakova, A.; Stofkova, K.R.; Malega, P.; Krejnus, M.; Binasova, V.; Daneshjo, N. Digital skills as a significant factor of human resources development. Sustainability 2022, 14, 13117.
23. Zaborovskaia, O.; Nadezhina, O.; Avduevskaya, E. The impact of digitalization on the formation of human capital at the regional level. J. Open Innov. Technol. Mark. Complex. 2020, 6, 184.
24. Tudose, M.B.; Georgescu, A.; Avasilcăi, S. Global analysis regarding the impact of digital transformation on macroeconomic outcomes. Sustainability 2023, 15, 4583.
25. Apostol, S. Digitalization and platformization in Romania based on the Digital Platform Economy Index 2020. Cent. Eur. Bus. Rev. 2023, 12, 77–103.
26. Greenstein, S. Digital infrastructure. In Economic Analysis and Infrastructure Investment; Glaeser, E.L., Poterba, J.M., Eds.; University of Chicago Press: Chicago, IL, USA, 2019; pp. 409–447.
27. Sun, P.; Sun, P. ICT infrastructure required for digital transformation. In Unleashing the Power of 5GtoB in Industries; Springer: Singapore, 2021; pp. 13–27.
28. Briglauer, W.; Krämer, J.; Palan, N. Socioeconomic benefits of high-speed broadband availability and service adoption: A survey. Telecommun. Policy 2024, 48, 102808.
29. Oughton, E.J.; Frias, Z.; Van Der Gaast, S.; Van Der Berg, R. Assessing the capacity, coverage and cost of 5G infrastructure strategies: Analysis of the Netherlands. Telemat. Inform. 2019, 37, 50–69.
30. Eswaran, S.; Honnavalli, P. Private 5G networks: A survey on enabling technologies, deployment models, use cases and research directions. Telecommun. Syst. 2023, 82, 3–26.
31. Holmström, J. From AI to digital transformation: The AI readiness framework. Bus. Horiz. 2022, 65, 329–339.
32. Shonubi, O.A. Advancing organisational technology readiness and convergence of emerging digital technologies (AI, IoT, I4.0) for innovation adoption. Int. J. Technol. Glob. 2024, 9, 50–91.
33. Aftab, J.; Stan, M.R.; Srivastava, M.; Wei, F.; Abid, N. The Impact of Digital Leadership on Performance: Examining the Roles of Big Data Analytical Capabilities, Green Innovation, and AI Change Readiness in Italian SMEs. Bus. Strategy Environ. 2025, in press.
34. Alfadhli, M.; Onat, N.C.; Kucukvar, M.; Al-Maadeed, S. Analyzing AI Readiness through Digital Transformation and Data Management: A Case Study of Qatar’s Government Sector. Appl. Math. Inf. Sci. 2025, 19, 497–507.
35. Cennamo, C.; Dagnino, G.B.; Di Minin, A.; Lanzolla, G. Managing digital transformation: Scope of transformation and modalities of value co-generation and delivery. Calif. Manag. Rev. 2020, 62, 5–16.
36. European Commission. Digital Economy and Society Index (DESI) 2023. 2023. Available online: https://digital-strategy.ec.europa.eu/en/policies/desi (accessed on 7 May 2025).
37. Kumar, D.; Samalia, H.V.; Verma, P. Exploring suitability of cloud computing for small and medium-sized enterprises in India. J. Small Bus. Enterp. Dev. 2017, 24, 814–832.
38. Uddin, A.; Cetindamar, D.; Hawryszkiewycz, I.; Sohaib, O. The role of dynamic cloud capability in improving SME’s strategic agility and resource flexibility: An empirical study. Sustainability 2023, 15, 8467.
39. Goldberg, N.; Leminski, C.; Gion, P.; Hautsch, V.; Hefter, K.; Langebartels, G.; Pfaff, H.; Ansmann, L.; Karbach, U.; Wurster, F. Socio-demographic and socio-economic determinants for the utilization of digital patient portals in hospitals: Systematic review and meta-analysis on the digital divide. J. Med. Internet Res. 2025, 27, e68091.
40. Brynjolfsson, E.; McAfee, A. The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies; WW Norton & Company: New York, NY, USA, 2014.
41. Afzal, A.; Khan, S.; Daud, S.; Ahmad, Z.; Butt, A. Addressing the digital divide: Access and use of technology in education. J. Soc. Sci. Rev. 2023, 3, 883–895.
42. Tweneboah-Koduah, S.; Endicott-Popovsky, B.; Tsetse, A. Barriers to government cloud adoption. Int. J. Manag. Inf. Technol. 2014, 6, 1–16.
43. Gallardo, R.; Whitacre, B.; Kumar, I.; Upendram, S. Broadband metrics and job productivity: A look at county-level data. Ann. Reg. Sci. 2021, 66, 161–184.
44. McAlexander, R.J.; Mentch, L. Predictive inference with random forests: A new perspective on classical analyses. Res. Politics 2020, 7, 2053168020905487.
45. Shah, A.; Bartlett, J.; Carpenter, J.; Nicholas, O.; Hemingway, H. Comparison of random forest and parametric imputation models for imputing missing data using mice: A caliber study. Am. J. Epidemiol. 2014, 179, 764–774.
46. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: Berlin/Heidelberg, Germany, 2019.
47. Ward, J.H., Jr. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 1963, 58, 236–244.
48. Ward, J.H., Jr.; Hook, M.E. Application of an hierarchical grouping procedure to a problem of grouping profiles. Educ. Psychol. Meas. 1963, 23, 69–81.
49. Tudor, C.; Sova, R.; Stamatiou, P.; Vlachos, V.; Polychronidou, P. Future-Proofing EU-27 Energy Policies with AI: Analyzing and Forecasting Fossil Fuel Trends. Electronics 2025, 14, 631.
50. Van Hoang, S.; Nguyen, K.M.; Huynh, T.M.; Huynh, K.L.A.; Nguyen, P.H.; Tran, H.P.N. Chest X-ray severity score as a putative predictor of clinical outcome in hospitalized patients: An experience from a Vietnamese COVID-19 field hospital. Cureus 2022, 14, e23323.
51. Cohen, J.; Byon, E.; Huan, X. To trust or not: Towards efficient uncertainty quantification for stochastic shapley explanations. In Proceedings of the PHM Society Asia-Pacific Conference, Seoul, Republic of Korea, 5–7 December 2023; Volume 4.
52. Goldstein, A.; Kapelner, A.; Bleich, J.; Pitkin, E. Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. J. Comput. Graph. Stat. 2015, 24, 44–65.
53. Arellano, M.; Bover, O. Another look at the instrumental variable estimation of error-components models. J. Econom. 1995, 68, 29–51.
54. Blundell, R.; Bond, S. Initial conditions and moment restrictions in dynamic panel data models. J. Econom. 1998, 87, 115–143.
55. Varian, H.R. Big data: New tricks for econometrics. J. Econ. Perspect. 2014, 28, 3–28.
56. Athey, S. The impact of machine learning on economics. In The Economics of Artificial Intelligence: An Agenda; University of Chicago Press: Chicago, IL, USA; pp. 507–547.
57. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. 2022. Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 7 May 2025).
58. Molnar, C.; Casalicchio, G.; Bischl, B. iml: An R package for interpretable machine learning. J. Open Source Softw. 2018, 3, 786.
59. Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 2014, 41, 647–665.
60. European Commission. Digital Education Action Plan (2021–2027). 2021. Available online: https://education.ec.europa.eu/focus-topics/digital-education/action-plan (accessed on 7 May 2025).
61. Marston, S.; Li, Z.; Bandyopadhyay, S.; Zhang, J.; Ghalsasi, A. Cloud computing—The business perspective. Decis. Support Syst. 2011, 51, 176–189.
62. Sultan, N. Making use of cloud computing for healthcare provision: Opportunities and challenges. Int. J. Inf. Manag. 2014, 34, 177–184.
63. Sharma, M.; Singh, A.; Daim, T. Exploring cloud computing adoption: COVID era in academic institutions. Technol. Forecast. Soc. Change 2023, 193, 122613.
64. Alashhab, Z.R.; Anbar, M.; Singh, M.M.; Leau, Y.B.; Al-Sai, Z.A.; Alhayja’a, S.A. Impact of coronavirus pandemic crisis on technologies and cloud computing applications. J. Electron. Sci. Technol. 2021, 19, 100059.
65. Tudor, C. Opportunities in clean energy equity markets: The compelling case for nuclear energy investments. J. Bus. Econ. Manag. 2024, 25, 960–980.
66. Zhang, X.; Xu, Y.; Ma, L. Research on successful factors and influencing mechanism of the digital transformation in SMEs. Sustainability 2022, 14, 2549.
67. Vassilakopoulou, P.; Hustad, E. Bridging digital divides: A literature review and research agenda for information systems research. Inf. Syst. Front. 2023, 25, 955–969.
68. AlBar, A.M.; Hoque, M.R. Factors affecting cloud ERP adoption in Saudi Arabia: An empirical study. Inf. Dev. 2019, 35, 150–164.
69. Marino, A.; Pariso, P. Digital government platforms: Issues and actions in Europe during pandemic time. Entrep. Sustain. Issues 2021, 9, 462.
70. Kovács, T.Z.; Bittner, B.; Huzsvai, L.; Nábrádi, A. Convergence and the Matthew effect in the European Union based on the DESI index. Mathematics 2022, 10, 613.
71. Firoiu, D.; Pîrvu, R.; Jianu, E.; Cismaș, L.M.; Tudor, S.; Lățea, G. Digital performance in EU member states in the context of the transition to a climate neutral economy. Sustainability 2022, 14, 3343.
72. Tudor, C.; Sova, R. Driving factors for R&D intensity: Evidence from global and income-level panels. Sustainability 2022, 14, 1854.
73. Cruz-Jesus, F.; Oliveira, T.; Bacao, F. Digital divide across the European Union. Inf. Manag. 2012, 49, 278–291.
74. Pejić Bach, M.; Bertoncel, T.; Meško, M.; Suša Vugec, D.; Ivančić, L. Big data usage in European countries: Cluster analysis approach. Data 2020, 5, 25.
75. Van Deursen, A.J.; Van Dijk, J.A. The first-level digital divide shifts from inequalities in physical access to inequalities in material access. New Media Soc. 2019, 21, 354–375.
76. Gomes, A.; Dias, J. Digital Divide in the European Union: A Typology of EU Citizens. Soc. Indic. Res. 2025, 176, 149–172.
77. Giannone, D.; Santaniello, M. Governance by indicators: The case of the Digital Agenda for Europe. Inf. Commun. Soc. 2019, 22, 1889–1902.
78. Helsper, E.J.; Van Deursen, A.J. Do the rich get digitally richer? Quantity and quality of support for digital engagement. Inf. Commun. Soc. 2017, 20, 700–714.
79. Alvarez-Galvez, J.; Salinas-Perez, J.A.; Montagni, I.; Salvador-Carulla, L. The persistence of digital divides in the use of health information: A comparative study in 28 European countries. Int. J. Public Health 2020, 65, 325–333.
80. Carpio, G.G. Racial projections: Cyberspace, public space, and the digital divide. Inf. Commun. Soc. 2018, 21, 174–190.
81. Hunady, J.; Pisar, P.; Vugec, D.S.; Bach, M.P. Digital Transformation in European Union: North is leading, and South is lagging behind. Int. J. Inf. Syst. Proj. Manag. 2022, 10, 4.
82. Ali, R.; Shen, Y.; Huang, X.; Zhang, J.; Ali, A. VMR: Virtual machine replacement algorithm for QoS and energy-awareness in cloud data centers. In Proceedings of the 2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC), Guangzhou, China, 21–24 July 2017; Volume 2, pp. 230–233.
83. European Commission. The Digital Europe Programme. 2025. Available online: https://digital-strategy.ec.europa.eu/en/activities/digital-programme (accessed on 20 June 2025).
84. European Commission. The Recovery and Resilience Facility. 2025. Available online: https://commission.europa.eu/business-economy-euro/economic-recovery/recovery-and-resilience-facility_en (accessed on 20 June 2025).

Figure 1. Method flowchart. Source: Constructed by the authors in Overleaf using the TikZ package. The TikZ code used for this figure is available upon request. Note: Broadband access available only until 2021.

Figure 2. Variable importance scores from the tuned random forest model on Panel A (no fixed effects). Source: Authors’ estimations in R software Version 2024.12.1+563.

Figure 3. Variable importance scores from the tuned random forest model with fixed effects on Panel A. Source: Authors’ estimations in R software.

Figure 4. SHAP summary plot for the tuned Random Forest model (without country dummies), showing the relative impact of features on predicted cloud adoption. (Panel A: 2014–2021). Source: Authors’ estimations in R software.

Figure 5. Partial Dependence Plot of Broadband Access → Cloud Adoption. Source: Authors’ estimations in R software.

Figure 6. Partial Dependence Plot of ICT Specialists → Cloud Adoption. Source: Authors’ estimations in R software.

Figure 7. Partial Dependence Plot of GDP per Capita → Cloud Adoption. Source: Authors’ estimations in R software.

Figure 8. Interaction Plot: Broadband Access and ICT Specialists → Cloud Adoption. Source: Authors’ estimations in R software.

Figure 9. Friedman’s H-statistic ranking of global interaction strength across predictors, derived from the tuned Random Forest model. Higher values reflect stronger dependence on interactions with other variables. Source: Authors’ estimations in R software.

Figure 10. Hierarchical Clustering of EU Countries Based on Digital and Socio-Economic Indicators (2014–2021). Source: Authors’ estimations in R software.

Figure 11. Variable importance scores for Panel B (Random Forest model). Source: Authors’ estimations in R software.

Figure 12. SHAP-based feature importance (Panel B, without broadband). Source: Authors’ estimations in R software.

Figure 13. Trend in Cloud Adoption Across EU Countries (2014–2024). Source: Authors’ estimations in R software.

Table 1. Variable Definitions, Sources, and Units.

Variable	Definition	Source (Eurostat Code/URL)	Unit	Years Available	Notes
Cloud Adoption	% of enterprises (10+ employees) using cloud computing	isoc_cicce_use	PC_ENT	2014–2024	Dependent variable; filtered by size GE10, unit PC_ENT, indicator E_CC
Higher Education	% of population aged 25–64 with ISCED 5–8	edat_lfs_9903	PC	2014–2024	Filtered for age Y 25–64, ISCED ED5-8, sex T, unit PC
GDP per Capita	GDP per person (constant prices)	nama_10_gdp	PC_GDP	2014–2024	Filtered by unit PC_GDP from national accounts
Unemployment Rate	% of active labor force unemployed	une_rt_m	PC_ACT	2014–2024	Filtered by NSA, sex T, age TOTAL, unit PC_ACT
ICT Specialists	% of employed workforce in ICT roles	isoc_sks_itspt	PC_EMP	2014–2024	Filtered by unit PC_EMP
Broadband Access	% of households with fixed broadband	isoc_r_broad_h	PC_HH	2014–2021	Filtered by unit PC_HH; limited to data until 2021

Note: All dataset codes and URLs are provided here: Cloud adoption (isoc_cicce_use): https://ec.europa.eu/eurostat/databrowser/view/isoc_cicce_use/; Higher education (edat_lfs_9903): https://ec.europa.eu/eurostat/databrowser/view/edat_lfs_9903/; GDP per capita (nama_10_gdp): https://ec.europa.eu/eurostat/databrowser/view/nama_10_gdp/; Unemployment (une_rt_m): https://ec.europa.eu/eurostat/databrowser/view/une_rt_m/; ICT specialists (isoc_sks_itspt): https://ec.europa.eu/eurostat/databrowser/view/isoc_sks_itspt/; Broadband access (isoc_r_broad_h): https://ec.europa.eu/eurostat/databrowser/view/isoc_r_broad_h/. All data was last accessed on 20 June 2025.

Table 2. Descriptive statistics (Panel A, EU 27, 2014–2021).

Variable	Median	Trimmed Mean	MAD	Min	Max	Range	Skewness	Kurtosis
Cloud Adoption	−0.26	−0.10	0.91	−1.40	2.71	4.11	0.84	−0.11
Higher Education	0.03	0.01	1.21	−1.97	2.27	4.24	−0.04	−1.03
GDP per Capita	0.22	0.03	0.96	−2.82	2.64	5.46	−0.30	0.63
Unemployment Rate	−0.20	−0.13	0.69	−1.55	4.59	6.14	1.64	3.80
ICT Specialists	−0.23	−0.09	0.97	−1.89	3.05	4.94	0.78	0.22
Broadband Access	0.10	0.07	1.07	−3.04	1.78	4.82	−0.59	0.00

Source: Authors’ calculations using Eurostat data.
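
The summary measures reported in Table 2 can be reproduced for any single column with a few lines of code. The sketch below is illustrative only: the function name `describe`, the 10% trimming fraction, and the 1.4826 scaling constant for the MAD are assumptions, since the article does not state which conventions were used. The negative kurtosis entries in Table 2 do imply that excess kurtosis (raw kurtosis minus 3) is reported, so the sketch follows that convention.

```python
import statistics
from math import floor


def describe(values, trim=0.1, mad_constant=1.4826):
    """Robust summary of one numeric column.

    `trim` and `mad_constant` are assumed defaults, not values taken
    from the article.
    """
    x = sorted(values)
    n = len(x)
    mean = statistics.fmean(x)
    median = statistics.median(x)

    # Trimmed mean: drop floor(n * trim) observations from each tail.
    k = floor(n * trim)
    trimmed_mean = statistics.fmean(x[k:n - k])

    # MAD: median absolute deviation from the median, rescaled so that it
    # is comparable to a standard deviation under normality.
    mad = mad_constant * statistics.median([abs(v - median) for v in x])

    # Moment-based skewness and excess kurtosis (population moments).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n

    return {
        "median": median,
        "trimmed_mean": trimmed_mean,
        "mad": mad,
        "min": x[0],
        "max": x[-1],
        "range": x[-1] - x[0],
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


if __name__ == "__main__":
    # Hypothetical standardized values, not data from the article.
    sample = [-1.2, -0.8, -0.3, -0.1, 0.0, 0.2, 0.4, 0.9, 1.5, 2.6]
    for name, value in describe(sample).items():
        print(f"{name:>12}: {value: .3f}")
```

A right-skewed column such as Unemployment Rate (skewness 1.64 in Table 2) is the case where the median and trimmed mean are most useful, because both are less sensitive to the long upper tail than the ordinary mean.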

Table 3. Cross-Validation Results for Tuned Random Forest Models on Panel A.

MTRY	RMSE	R²	MAE	RMSE SD	R² SD	MAE SD
2	0.5639	0.6887	0.4158	0.0394	0.0987	0.0320
3	0.5654	0.6850	0.4190	0.0475	0.1031	0.0335
4	0.5639	0.6837	0.4217	0.0397	0.0921	0.0321
5	0.5641	0.6822	0.4211	0.0418	0.0949	0.0315
6	0.5648	0.6854	0.4202	0.0455	0.0915	0.0306
7	0.5640	0.6852	0.4194	0.0414	0.0883	0.0300
8	0.5675	0.6797	0.4227	0.0511	0.0960	0.0380
9	0.5662	0.6814	0.4209	0.0466	0.0953	0.0315
10	0.5646	0.6827	0.4219	0.0437	0.0904	0.0307

Source: Authors’ calculations using Eurostat data.
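
Table 3 can be read programmatically to see how little the choice of MTRY matters here. The sketch below copies the RMSE, R², and RMSE SD columns from the table, picks the row with the lowest RMSE, and compares the spread of RMSE across all MTRY values with the fold-to-fold standard deviation. The tie-breaking rule (prefer the higher R² when RMSE is equal) is an assumption made for illustration; this part of the article does not state which rule was applied.

```python
from typing import NamedTuple


class CvRow(NamedTuple):
    mtry: int
    rmse: float
    r2: float
    rmse_sd: float


# Values copied from Table 3 (MAE and the remaining SD columns omitted).
CV_RESULTS = [
    CvRow(2, 0.5639, 0.6887, 0.0394),
    CvRow(3, 0.5654, 0.6850, 0.0475),
    CvRow(4, 0.5639, 0.6837, 0.0397),
    CvRow(5, 0.5641, 0.6822, 0.0418),
    CvRow(6, 0.5648, 0.6854, 0.0455),
    CvRow(7, 0.5640, 0.6852, 0.0414),
    CvRow(8, 0.5675, 0.6797, 0.0511),
    CvRow(9, 0.5662, 0.6814, 0.0466),
    CvRow(10, 0.5646, 0.6827, 0.0437),
]

# Lowest RMSE wins; ties are broken by the higher R² (assumed rule).
best = min(CV_RESULTS, key=lambda row: (row.rmse, -row.r2))

# How much RMSE varies across MTRY values versus across CV folds.
rmse_spread = max(r.rmse for r in CV_RESULTS) - min(r.rmse for r in CV_RESULTS)
smallest_sd = min(r.rmse_sd for r in CV_RESULTS)

if __name__ == "__main__":
    print(f"best mtry: {best.mtry} (RMSE {best.rmse}, R2 {best.r2})")
    print(f"RMSE spread across mtry: {rmse_spread:.4f}")
    print(f"smallest fold-level RMSE SD: {smallest_sd:.4f}")
```

Because the spread across MTRY values is much smaller than the smallest fold-level standard deviation, the differences between rows in Table 3 are within cross-validation noise.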

Table 4. Variable Importance Scores for the Tuned Random Forest Model on Panel A (No Country Dummies).

Predictor	%IncMSE	IncNodePurity
ICT Specialists	29.16	39.84
Broadband Access	21.23	34.23
GDP per Capita	14.26	11.38
Higher Education	10.82	18.56
Unemployment Rate	9.67	9.08

Source: Authors’ calculations.
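
The two columns of Table 4 measure importance differently: %IncMSE is based on the loss of predictive accuracy when a predictor is permuted, while IncNodePurity sums the reduction in node impurity from splits on that predictor. The two need not produce the same ordering. The short sketch below, using only the values from Table 4, ranks the predictors under each measure and lists those whose rank differs; the helper name `rank_by` is hypothetical.

```python
# Values copied from Table 4: predictor -> (%IncMSE, IncNodePurity).
IMPORTANCE = {
    "ICT Specialists": (29.16, 39.84),
    "Broadband Access": (21.23, 34.23),
    "GDP per Capita": (14.26, 11.38),
    "Higher Education": (10.82, 18.56),
    "Unemployment Rate": (9.67, 9.08),
}


def rank_by(column):
    """Return predictor names sorted from most to least important."""
    return sorted(IMPORTANCE, key=lambda name: IMPORTANCE[name][column], reverse=True)


by_inc_mse = rank_by(0)
by_node_purity = rank_by(1)

# Predictors whose position differs between the two rankings.
disagreements = [
    name
    for name in IMPORTANCE
    if by_inc_mse.index(name) != by_node_purity.index(name)
]

if __name__ == "__main__":
    print("By %IncMSE:      ", by_inc_mse)
    print("By IncNodePurity:", by_node_purity)
    print("Rank differs for:", disagreements)
```

ICT Specialists and Broadband Access hold the top two positions under both measures, while GDP per Capita and Higher Education swap third and fourth place.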

Table 5. Variable Importance Scores—Tuned Random Forest Model with Country Fixed Effects (Panel A).

Rank	Variable	%IncMSE	IncNodePurity	Type
1	ICT Specialists	28.18	33.32	Core Variable
2	Broadband Access	20.12	29.08	Core Variable
3	Year (Numeric)	14.27	11.26	Temporal Trend
4	geo_LU (Luxembourg)	13.70	3.15	Country Dummy
5	GDP per Capita	13.32	6.64	Core Variable
6	Higher Education	10.70	11.55	Core Variable
7	geo_BG (Bulgaria)	8.34	0.85	Country Dummy
8	geo_DK (Denmark)	7.98	1.92	Country Dummy
9	Unemployment Rate	7.97	4.89	Core Variable
10	geo_FI (Finland)	7.61	3.34	Country Dummy

Table 6. Ranking of top predictors by model.

Model	Top 1 Predictor	Top 2	Top 3
Random Forest	ICT Specialists	Broadband Access	GDP per Capita
XGBoost	ICT Specialists	Broadband Access	Year
Elastic Net	Year	ICT Specialists	geo_FI (Finland)

Table 7. Training time for each model.

Model	Training Time (s)	Notes
Random Forest	3.26	500 trees; mtry optimized
XGBoost	8.02	Tuned over nrounds, eta, max_depth
Elastic Net	0.20	glmnet-based regularization (fastest)
SVM (Linear Kernel)	0.14	Tuned C with linear kernel
Ensemble (Stacked)	12.36	Aggregation of base learners via greedy RMSE

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Tudor, C.; Florescu, M.; Polychronidou, P.; Stamatiou, P.; Vlachos, V.; Kasabali, K. Cloud Adoption in the Digital Era: An Interpretable Machine Learning Analysis of National Readiness and Structural Disparities Across the EU. Appl. Sci. 2025, 15, 8019. https://doi.org/10.3390/app15148019
