Cloud Adoption in the Digital Era: An Interpretable Machine Learning Analysis of National Readiness and Structural Disparities Across the EU
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper examines the differences in cloud computing adoption among EU countries using interpretable machine learning models. It uses SHAP (SHapley Additive exPlanations) values to evaluate the contributions of various features, with the goal of identifying both structural and readiness-related factors that affect cloud adoption. The topic is timely and relevant, especially in the context of digital transformation across the EU, and the use of interpretable ML models adds value to the empirical analysis. However, certain aspects of the paper should be clarified and improved to enhance readability:
- The primary research contribution should be articulated clearly in the introduction, alongside a justification for the selection of interpretable ML models, such as SHAP, as opposed to traditional methodologies like regression.
- Reorganize Section 3 into the phases of data preparation, modeling, and result interpretation.
- Provide more detail on data preprocessing (e.g., handling of missing values, normalization, and encoding).
- Clarify whether feature selection criteria were utilized before modeling.
- Clarify how the SHAP values were calculated (e.g., a model-agnostic vs. a model-specific approach; see the first sketch after this list).
- Include SHAP visualizations such as the summary plot and the feature-importance plot.
- Give details about the origins of the dataset and the precise data sources. The paper does not specify the years of data collection or the download links.
- Enhance the link between findings and EU digitalization policies or strategic goals. This connection should be made more explicit, especially in the conclusion or discussion.
- Use larger font sizes and clearer axes and labels.
- Create a performance-metrics table comparing the models using RMSE, MAE, or other relevant indicators (see the second sketch after this list).
- Some sentences require grammatical improvement.
- Include recent studies (2022–2024) on interpretable ML in public policy or digital transformation.
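To illustrate the model-agnostic vs. model-specific distinction raised above, a minimal sketch follows; the data and feature names are synthetic, and the paper's actual models and features may differ:

```python
# Illustrative sketch only: model-specific vs. model-agnostic SHAP.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 4)),
                 columns=["broadband", "ict_density", "education", "gdp"])
y = 2 * X["broadband"] + X["education"] + rng.normal(size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(random_state=0).fit(X_train, y_train)
svm = SVR().fit(X_train, y_train)

# Model-specific path: TreeExplainer is exact and fast for tree ensembles.
rf_shap = shap.TreeExplainer(rf).shap_values(X_test)

# Model-agnostic path: KernelExplainer handles any predictor (e.g., SVM),
# using a background sample to keep the estimate tractable.
background = shap.sample(X_train, 50)
svm_shap = shap.KernelExplainer(svm.predict, background).shap_values(X_test)

# The two requested plots: summary (beeswarm) and mean-|SHAP| bar chart.
shap.summary_plot(rf_shap, X_test)
shap.summary_plot(rf_shap, X_test, plot_type="bar")
```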
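And a minimal sketch of the requested performance-metrics table, again on synthetic data rather than the paper's panel:

```python
# Illustrative sketch of an RMSE/MAE comparison table across the three models.
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from xgboost import XGBRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "Random Forest": RandomForestRegressor(random_state=0),
    "XGBoost": XGBRegressor(random_state=0),
    "SVM": SVR(),
}
rows = []
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    rows.append({"Model": name,
                 "RMSE": np.sqrt(mean_squared_error(y_te, pred)),
                 "MAE": mean_absolute_error(y_te, pred)})
print(pd.DataFrame(rows).set_index("Model").round(3))
```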
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
- The abstract is not coherent. It would be good if the authors could add a sentence describing the numerical results and the improvement over other methods.
- While Random Forests, XGBoost, and SVM are solid choices, the rationale behind choosing these particular models could be better clarified. Are they selected for robustness, accuracy, interpretability, or something else?
- Given the longitudinal nature of the dataset, were any time-aware models (e.g., time-series regressors, panel regression with fixed/random effects) considered? Traditional ML models may not fully capture temporal dependencies unless lag features are explicitly engineered (see the first sketch after this list).
- State the motivation for using this method in the introduction. Why did the existing schemes fail? Has no prior study addressed this aspect? If one has, it must be mentioned.
- The authors should comment on each cited paper after introducing it, and should provide a more thorough critical literature review that indicates the drawbacks of existing approaches. Some recent works should also be added, such as "An attention-driven spatio-temporal deep hybrid neural networks for traffic flow prediction in transportation systems" and "Attention-Driven Graph Convolutional Networks for Deadline-Constrained Virtual Machine Task Allocation in Edge Computing".
- The network parameters were tuned on the training data "until the model obtains the maximum accuracy". If this is the training accuracy, over-fitting may have occurred; if it is the testing accuracy, the system is being tuned on the same subset used for evaluation. A validation subset should be used so that the system is optimized on data distinct from the test data, without over-fitting (see the second sketch after this list). It would also be interesting to know which range of each parameter was analyzed.
- How does the model handle multicollinearity between variables such as broadband coverage, ICT density, and education level (see the VIF sketch after this list)? Please also add works such as "VMR: virtual machine replacement algorithm for QoS and energy-awareness in cloud data centers".
- The technical details are not very intelligible; please provide stronger technical detail in the main methodology. The training time of the proposed method and of the compared algorithms could be listed.
- Did you experiment with ensemble blending (e.g., stacking) to combine the predictions of Random Forest, XGBoost, and SVM for improved accuracy or robustness (see the stacking sketch after this list)?
- Were any country-level fixed effects or regional dummy variables introduced to control for national idiosyncrasies in policy or data quality (see the final sketch after this list)?
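On the temporal-dependency point, a minimal sketch of explicit lag-feature engineering for a country-year panel; the column names and values are hypothetical:

```python
# Illustrative sketch: lag features computed within each country so a
# standard ML regressor can see temporal dependencies.
import pandas as pd

panel = pd.DataFrame({
    "country": ["AT", "AT", "AT", "BE", "BE", "BE"],
    "year": [2014, 2015, 2016, 2014, 2015, 2016],
    "cloud_adoption": [10.0, 12.0, 15.0, 20.0, 22.0, 25.0],
}).sort_values(["country", "year"])

# One- and two-year lags of the target, shifted within each country group.
for lag in (1, 2):
    panel[f"cloud_adoption_lag{lag}"] = (
        panel.groupby("country")["cloud_adoption"].shift(lag)
    )
print(panel)
```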
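On the tuning point, a sketch of a protocol in which parameters are selected on validation folds inside the training set and the held-out test set is touched only once; the parameter ranges shown are purely illustrative:

```python
# Illustrative sketch: cross-validated hyperparameter search that never
# sees the test set during tuning, avoiding the leakage the comment flags.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_regression(n_samples=300, n_features=6, noise=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

param_grid = {"n_estimators": [100, 300, 500], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid,
                      cv=5, scoring="neg_root_mean_squared_error")
search.fit(X_tr, y_tr)  # validation happens on CV folds, not the test set

print("Best params:", search.best_params_)
print("Held-out test R^2:", search.best_estimator_.score(X_te, y_te))
```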
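On the multicollinearity point, a sketch of a variance-inflation-factor (VIF) check on deliberately correlated synthetic predictors; the paper's actual variables may behave differently:

```python
# Illustrative sketch: VIF diagnostics for collinear predictors such as
# broadband coverage and ICT density.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
broadband = rng.normal(size=200)
X = pd.DataFrame({
    "broadband": broadband,
    "ict_density": 0.9 * broadband + rng.normal(scale=0.3, size=200),
    "education": rng.normal(size=200),
})
Xc = sm.add_constant(X)  # include an intercept so VIFs are centered
vif = pd.Series(
    [variance_inflation_factor(Xc.values, i) for i in range(1, Xc.shape[1])],
    index=X.columns, name="VIF",
)
print(vif.round(2))  # VIF > 10 is a common multicollinearity flag
```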
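On the blending point, a sketch of the suggested stacking ensemble over the three base models with a linear meta-learner:

```python
# Illustrative sketch: stacking Random Forest, XGBoost, and SVM, with a
# ridge meta-learner blending their out-of-fold predictions.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR
from xgboost import XGBRegressor

X, y = make_regression(n_samples=300, n_features=6, noise=5, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(random_state=0)),
        ("xgb", XGBRegressor(random_state=0)),
        ("svm", SVR()),
    ],
    final_estimator=RidgeCV(),  # meta-learner blends the base predictions
    cv=5,
)
print(cross_val_score(stack, X, y, cv=5, scoring="r2").mean())
```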
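And on the fixed-effects point, a sketch of adding country dummies so the models can absorb national idiosyncrasies; the panel shown is hypothetical:

```python
# Illustrative sketch: country fixed effects encoded as one-hot dummies.
import pandas as pd

panel = pd.DataFrame({
    "country": ["AT", "BE", "AT", "BE"],
    "year": [2014, 2014, 2015, 2015],
    "broadband": [70.0, 80.0, 75.0, 85.0],
})
# drop_first avoids the dummy trap in regression-style models.
features = pd.get_dummies(panel, columns=["country"], drop_first=True)
print(features)
```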
Comments on the Quality of English Language
The English could be improved to more clearly express the research.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The objective of the proposed article is to analyze national-level cloud computing adoption across the EU using explainable machine learning models.
The authors use the following methodologies: Random Forests, XGBoost, and SVM; SHAP and ICE for model interpretability; two harmonized panel datasets (2014–2021 and 2014–2024); and a clustering analysis that groups EU countries by digital maturity.
We find the following strengths: strong policy relevance and alignment with the EU Digital Decade agenda; the use of explainable AI methods, which enhances both transparency and trust; the combination of temporal robustness (through dual panels) and cross-sectional insight (through clustering); and a rich literature foundation with clear hypothesis development.
We propose some potential improvements: (i) broaden the comparative scope, since including non-EU countries or regions such as EFTA could offer deeper insight into structural divergence; and (ii) expand the methodological depth by including a causal inference layer (e.g., instrumental variables or difference-in-differences, if relevant data are available; a minimal sketch follows below).
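As an illustration of the suggested causal layer, a difference-in-differences sketch with entirely hypothetical treatment and period definitions:

```python
# Illustrative sketch: the treated:post interaction coefficient is the
# DiD estimate of a (hypothetical) policy effect on cloud adoption.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),  # e.g., countries with a policy push
    "post": rng.integers(0, 2, n),     # e.g., after a funding programme
})
df["cloud_adoption"] = (
    10 + 2 * df["treated"] + 3 * df["post"]
    + 4 * df["treated"] * df["post"] + rng.normal(size=n)
)
model = smf.ols("cloud_adoption ~ treated * post", data=df).fit()
print(model.summary().tables[1])
```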
The figures mentioned (e.g., PDPs, SHAP plots, the clustering dendrogram) are referenced but not embedded in this PDF. Ensure they are included in the final version for clarity.
More detail on how digital skill-building or broadband investment strategies differ among EU countries could strengthen the policy recommendations.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
No more comments.
Comments on the Quality of English Language
The English could be improved to more clearly express the research.