Open Access Article

Peer-Review Record

AI-Driven Digital Twin for Optimizing Solar Submersible Pumping Systems

Inventions 2025, 10(6), 93; https://doi.org/10.3390/inventions10060093

by Yousef Salah¹, Omar Shalash²,*, Esraa Khatab³, Mostafa Hamad¹ and Sherif Imam⁴

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 31 August 2025 / Revised: 19 October 2025 / Accepted: 22 October 2025 / Published: 25 October 2025

(This article belongs to the Special Issue Advanced Technologies and Artificial Intelligence for Sustainable and Intelligent Transportation Systems: Second Edition)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This study focuses on improving the performance of solar-powered submersible pumps in remote and desert-like regions using an AI-driven digital twin framework. By collecting six months of real-world data, the research develops and tests predictive models to estimate and optimize the system’s power output and water delivery in real time. The results demonstrate that the Random Forest model provides the most accurate predictions, offering a practical solution for reliable water access in areas with limited infrastructure.

Major issues

This remote area is in Beheira, Egypt, which has a lot of farmland that needs a water supply for the plants.
In dataset Section 3, data were collected in the Khataba region in Elsadat, Beheira Governorate, Egypt, while the conclusion mentions rural Housh Eissa, Beheira Governorate, Egypt. Which is correct? Also, Housh Eissa, Beheira Governorate, Egypt, is not a rural area.
The structure of the manuscript needs to be revised.
What is the difference between Table 5 and Table 6? Neither is mentioned in the text.
The conclusion is not clear.

Minor issues

Figure 3 is missing in the text.
The sequence of figures is not convenient, the authors mentioned.
Figure 7 and Figure 8, in line 494: are they in the right place?
You can’t mention Figure 16 and Figure 17 before Figure 14.
The table caption should come before the table, not after.
There is a major problem with the sequence of the tables, which makes it hard for the reader to follow the meaning.
24 tables is a large number for one manuscript.

Author Response

Response to Reviewer 1 Comments

Point-by-Point Response to Major Issues

――――――――――――――――――――――――――――――――――――――――――――――――――

1.1 Major Issue 1:

This remote area is in Beheira, Egypt, which has a lot of farmland that needs a water supply for the plants.

Response 1: Thank you for your comment. We fully agree that large areas of farmland need a water supply. However, some areas have no traditional water source such as a river. This research therefore aims to create a model that lets farmers compute the total water lifted from a water reservoir, making farming more feasible in what can be classified as harsh-climate areas.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.2 Major Issue 2:

In dataset Section 3, data were collected in the Khataba region in Elsadat, Beheira Governorate, Egypt, while the conclusion mentions rural Housh Eissa, Beheira Governorate, Egypt. Which is correct? Also, Housh Eissa, Beheira Governorate, Egypt, is not a rural area.

Response 2:

Thank you for this valuable feedback, and we sincerely apologize for this error. The correct location is the Khataba region in Elsadat, Beheira Governorate, Egypt. We have corrected this in the dataset subsection and the conclusion section.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.3 Major Issue 3:

The structure of the manuscript needs to be revised.

Response 3:

Thank you for this valuable feedback. We have reorganized the sections in the following order:

Introduction
Materials and Methods
Results
Discussion
Conclusions

――――――――――――――――――――――――――――――――――――――――――――――――――

1.4 Major Issue 4:

What is the difference between Table 5 and Table 6? Neither is mentioned in the text.

Response 4:

Thank you for this valuable feedback. We have addressed that.

Clarification of table purposes:

Table 5 (Optimized Hyperparameters for Tree-Based Models - Phase 1): Presents the final hyperparameter configurations for Random Forest, Gradient Boosting, and XGBoost selected after random search with 5-fold cross-validation

Table 6 (Final Optimized Hyperparameters for Neural Network 1): Presents the neural network architecture and training parameters selected after Bayesian Optimization

Table 6 was mistakenly captioned as Table 5; this has been corrected.
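The random search with 5-fold cross-validation mentioned for Table 5 can be illustrated with a minimal sketch. This is not the authors' pipeline: a toy one-parameter ridge regression stands in for the tree-based models, and the data, the search range, and all function names are hypothetical choices made for illustration.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_ridge_1d(xs, ys, lam):
    """Closed-form slope of a no-intercept ridge regression (toy stand-in model)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_mse(xs, ys, lam, folds):
    """Mean squared error over all held-out samples across the folds."""
    total, count = 0.0, 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        total += sum((ys[i] - w * xs[i]) ** 2 for i in fold)
        count += len(fold)
    return total / count

def random_search(xs, ys, n_trials=20, k=5, seed=0):
    """Sample the hyperparameter at random and keep the best CV score."""
    rng = random.Random(seed)
    folds = k_fold_indices(len(xs), k, rng)
    best_lam, best_score = None, float("inf")
    for _ in range(n_trials):
        lam = 10 ** rng.uniform(-3, 2)  # log-uniform draw (assumed search range)
        score = cv_mse(xs, ys, lam, folds)
        if score < best_score:
            best_lam, best_score = lam, score
    return best_lam, best_score

# Synthetic data: y is roughly 2x with a small alternating offset (illustrative only).
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
best_lam, best_score = random_search(xs, ys)
```

The same folds are reused for every candidate so that scores are compared on identical splits; a real study would swap in the actual model and its multi-dimensional search space.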

――――――――――――――――――――――――――――――――――――――――――――――――――

1.5 Major Issue 5:

The conclusion is not clear.

Response 5:

Thank you for this valuable feedback. The conclusion has been rewritten; quantitative validation, acknowledged limitations, and future research directions were added.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.6 Minor Issue 6:

Figure 3 is missing in the text.

Response 6:

Thank you for the observation. Figure 3 has been referenced in the text.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.7 Minor Issue 7:

The sequence of figures is not convenient, the authors mentioned.

Response 7:

We fully agree and apologize for the confusing figure organization in the original submission.

Action taken: Figures have been reordered.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.8 Minor Issue 8:

Figure 7 and Figure 8, in line 494: are they in the right place?

Response 8:

Thank you for this observation. Position corrected.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.9 Minor Issue 9:

You can't mention Figure 16 and Figure 17, before Figure 14.

Response 9:

Thank you for this observation. The figures are now cited in the text in a more appropriate order.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.10 Minor Issue 10:

The table caption is before the table not after.

Response 10:

Thank you for this correction. We have corrected all table caption placements accordingly.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.11 Minor Issue 11:

There is a major problem with the sequence of the tables, which makes it hard for the reader to follow the meaning.

Response 11:

Thank you for your feedback. The tables and their sequence have been revised.

――――――――――――――――――――――――――――――――――――――――――――――――――

1.12 Minor Issue 12:

24 tables is a large number for one manuscript.

Response 12:

Thank you for your concern. The tables in this manuscript serve essential purposes for scientific rigor and reproducibility. We tried to combine them, but doing so hurt readability. If you find any table unnecessary, please point it out, and it will be converted to text.

Reviewer 2 Report

Comments and Suggestions for Authors

This manuscript introduces a framework for modeling and optimizing the performance of a solar-powered submersible pump system through an AI-enabled digital twin. The study addresses a timely topic, but several areas require clarification and refinement. My detailed comments are as follows:

Although the framework is conceptually well-motivated, its originality compared to existing digital twin applications in renewable energy is not sufficiently emphasized. The authors should articulate more clearly how their approach differs from earlier AI-driven pump optimization or PV–pump integration studies.
The dataset consists of only six months of measurements from a single location (Khataba, Egypt). Such geographic and temporal restrictions limit generalizability to other climates, terrains, or seasonal variations. The authors should explicitly acknowledge these constraints and, where feasible, consider cross-seasonal validation or synthetic data augmentation.
The comparative evaluation covers Random Forest, XGBoost, Gradient Boosting, Neural Networks, and Linear Regression. However, the absence of simple yet informative baselines—such as persistence models or physics-only hydraulic simulations—makes it difficult to contextualize the ML performance gains. Including at least one empirical or physics-based benchmark would strengthen the study.
The reported metrics (MAE, RMSE, R²) are informative, but the statistical treatment is inconsistent. Confidence intervals, variance over multiple runs with different random seeds, and significance testing (e.g., Wilcoxon signed-rank, Diebold–Mariano) are needed to substantiate claims of model superiority. Overfitting signs in the neural network models also require careful discussion.
The workflow describes sequential prediction (frequency → power → water volume). However, the defining feature of a digital twin—continuous feedback and adaptive control—is not sufficiently explored. The paper would benefit from elaborating on how real-time sensor data could be incorporated into a closed-loop digital twin for adaptive deployment.
While training and inference time are measured, the discussion of deployment feasibility on resource-constrained edge devices is only partial. The authors should expand on the energy and hardware limitations typical of remote desert locations and compare models in terms of cost–benefit trade-offs for deployment.
The mathematical formulations for TDH and flow rate are correct, but the integration with AI predictions remains somewhat abstract. A worked numerical example linking predicted power to water output would enhance interpretability.
At times, the writing style employs overstated expressions such as “consistently outperforms.” These should be toned down and replaced with more evidence-based, cautiously framed language.
The figures, although informative, are visually dense. Simplifying legends, standardizing axis scales, and emphasizing the most salient trends would improve readability.
The related work section mainly discusses general AI and renewable energy studies. To situate this work within the state of the art, more recent applications of digital twins (2022–2025), particularly in water systems and PV-driven pumping, should be reviewed.
The conclusion highlights performance but underplays practical deployment issues such as sensor failure, data latency, and maintenance needs. A dedicated subsection on limitations and future directions, with attention to real-world adoption barriers, would make the study more balanced and credible.

Comments on the Quality of English Language

The study is technically sound and relevant to the journal’s scope. However, the manuscript would benefit from clearer writing, further clarification of methods, and a more explicit discussion of dataset limitations and generalization issues. In addition, several parts of the manuscript would benefit from careful English editing to improve readability and precision.

Author Response

Response to Reviewer 2 Comments

Point-by-Point Response

――――――――――――――――――――――――――――――――――――――――――――――――――

2.1 Comment 1: Although the framework is conceptually well-motivated, its originality compared to existing digital twin applications in renewable energy is not sufficiently emphasized. The authors should articulate more clearly how their approach differs from earlier AI-driven pump optimization or PV–pump integration studies.

Response:

Thanks for your feedback. We have revised the Introduction section and added a paragraph to explicitly compare our approach with prior works in digital twin applications and AI-driven PV-pump optimization. The paragraph starts with “While digital twin applications in renewable energy have”

――――――――――――――――――――――――――――――――――――――――――――――――――

2.2 The dataset consists of only six months of measurements from a single location (Khataba, Egypt). Such geographic and temporal restrictions limit generalizability to other climates, terrains, or seasonal variations. The authors should explicitly acknowledge these constraints and, where feasible, consider cross-seasonal validation or synthetic data augmentation.

Response:

Thank you for highlighting this important limitation. We agree that the geographic and temporal scope of our dataset (six months of measurements from a single location in Khataba, Egypt) may constrain the generalizability of our findings to other climates, terrains, or seasonal conditions. To address this, we have revised the manuscript and acknowledged this constraint in the Discussion section (Future Work subsection).

――――――――――――――――――――――――――――――――――――――――――――――――――

2.3 The comparative evaluation covers Random Forest, XGBoost, Gradient Boosting, Neural Networks, and Linear Regression. However, the absence of simple yet informative baselines—such as persistence models or physics-only hydraulic simulations—makes it difficult to contextualize the ML performance gains. Including at least one empirical or physics-based benchmark would strengthen the study.

Response:

Thank you for this valuable suggestion. To provide a clearer context for evaluating the machine learning models, a persistence baseline was included, in which the output frequency and power are predicted from the previously observed values. This baseline has been added to both the Phase 1 and Phase 2 analyses.
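A persistence baseline of the kind described here can be sketched in a few lines. This is a minimal illustration, assuming a one-step lag; the power readings are hypothetical values, not data from the study, and the function names are illustrative.

```python
import math

def persistence_forecast(series):
    """Predict each value as the previous observed value (one-step persistence)."""
    # The first sample has no predecessor, so forecasts align with series[1:].
    return list(series[:-1])

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical pump power readings in kW (illustrative, not from the study).
power_kw = [5.0, 5.5, 6.0, 5.5, 5.0]
targets = power_kw[1:]
preds = persistence_forecast(power_kw)
baseline_mae = mae(targets, preds)
baseline_rmse = rmse(targets, preds)
```

Any learned model is then expected to beat `baseline_mae` and `baseline_rmse` on the same targets before its gains are considered meaningful.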

――――――――――――――――――――――――――――――――――――――――――――――――――

2.4 The reported metrics (MAE, RMSE, R²) are informative, but the statistical treatment is inconsistent. Confidence intervals, variance over multiple runs with different random seeds, and significance testing (e.g., Wilcoxon signed-rank, Diebold–Mariano) are needed to substantiate claims of model superiority. Overfitting signs in the neural network models also require careful discussion

Response:

Thank you for your valuable feedback on statistical treatment. To address this point, confidence intervals for all models have now been added in the revised manuscript. These 95% intervals were computed using bootstrap resampling to quantify uncertainty and ensure consistency in comparative evaluation.

Regarding variance across random initializations, model results were averaged over multiple training runs with different random seeds, and the observed variance was minimal, further reinforcing the stability of model performance.

As for overfitting, the neural network model exhibited comparable R², RMSE, and MAE values across training, validation, and testing datasets (R² ≈ 0.955 for all), indicating no substantial overfitting.
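For readers unfamiliar with bootstrapped intervals, the following sketch shows one common variant, the percentile bootstrap, applied to a mean absolute error. The response does not say which bootstrap variant was used, so this is an assumption; the error values and the function name `bootstrap_ci` are hypothetical.

```python
import random

def bootstrap_ci(errors, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `errors`."""
    rng = random.Random(seed)
    n = len(errors)
    means = []
    for _ in range(n_resamples):
        # Resample n values with replacement and record their mean.
        sample = [errors[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical absolute prediction errors on a test set (illustrative only).
abs_errors = [0.12, 0.30, 0.05, 0.22, 0.18, 0.09, 0.27, 0.15, 0.11, 0.20]
mae_point = sum(abs_errors) / len(abs_errors)
ci_low, ci_high = bootstrap_ci(abs_errors)
```

Fixing the seed makes the interval reproducible; repeating the procedure with several seeds gives a rough check that the reported interval is stable.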

――――――――――――――――――――――――――――――――――――――――――――――――――

2.5 The workflow describes sequential prediction (frequency → power → water volume). However, the defining feature of a digital twin—continuous feedback and adaptive control—is not sufficiently explored. The paper would benefit from elaborating on how real-time sensor data could be incorporated into a closed-loop digital twin for adaptive deployment.

Response:

Thank you for this insightful feedback. The Future Work subsection has been modified to emphasize how real-time sensor data could be incorporated into a closed-loop digital twin architecture.

――――――――――――――――――――――――――――――――――――――――――――――――――

2.6 While training and inference time are measured, the discussion of deployment feasibility on resource-constrained edge devices is only partial. The authors should expand on the energy and hardware limitations typical of remote desert locations and compare models in terms of cost–benefit trade-offs for deployment.

Response:

Thank you for the valuable feedback. A subsection in the discussion has been added: “Deployment Feasibility in Resource-Constrained Desert Environments”. It includes a comparative analysis of the energy and hardware limitations typical of such settings.

――――――――――――――――――――――――――――――――――――――――――――――――――

2.7 The mathematical formulations for TDH and flow rate are correct, but the integration with AI predictions remains somewhat abstract. A worked numerical example linking predicted power to water output would enhance interpretability.

Response: Thank you for your comment. Please note that Equations 2, 3, 4, and 5 are the general equations required for the model integration, where P is the input from our developed models and Q is the output. Equations 6, 7, 8, and 9 are the numerical example of these general equations, based on water reservoir data.
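The manuscript's Equations 2 to 9 are not reproduced in this record, so the link from predicted power P to flow Q can only be illustrated generically. The sketch below uses the standard hydraulic power relation P_hyd = ρ g Q H, which may differ from the authors' exact formulation; the efficiency, total dynamic head, and power values are assumptions chosen for illustration.

```python
RHO_WATER = 1000.0  # water density in kg/m^3
G = 9.81            # gravitational acceleration in m/s^2

def flow_rate_m3_per_h(power_w, tdh_m, efficiency):
    """Flow rate from input power, using P_hyd = rho * g * Q * H.

    power_w:    electrical input power in watts (e.g. a model prediction)
    tdh_m:      total dynamic head in metres
    efficiency: overall pump-motor efficiency, in (0, 1]
    """
    if tdh_m <= 0 or not 0 < efficiency <= 1:
        raise ValueError("tdh_m must be positive and efficiency in (0, 1]")
    q_m3_per_s = efficiency * power_w / (RHO_WATER * G * tdh_m)
    return q_m3_per_s * 3600.0

# Assumed inputs: 5 kW predicted power, 50 m head, 50% overall efficiency.
predicted_power_w = 5000.0
flow = flow_rate_m3_per_h(predicted_power_w, tdh_m=50.0, efficiency=0.5)
volume_6h = flow * 6  # water lifted over six hours at constant power, in m^3
```

With these assumed inputs the flow is roughly 18.3 m³/h, which shows how a power prediction translates into a daily water volume once head and efficiency are fixed.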

――――――――――――――――――――――――――――――――――――――――――――――――――

2.8 At times, the writing style employs overstated expressions such as “consistently outperforms.” These should be toned down and replaced with more evidence-based, cautiously framed language.

Response:

We appreciate this valuable suggestion. We have revised the manuscript to use measured, evidence-based language.

Examples of changes:

"consistently outperforms" → "demonstrated superior performance across all evaluation metrics"

――――――――――――――――――――――――――――――――――――――――――――――――――

2.9 The figures, although informative, are visually dense. Simplifying legends, standardizing axis scales, and emphasizing the most salient trends would improve readability.

Response:

Thank you for your observation. The figures have been revised.

――――――――――――――――――――――――――――――――――――――――――――――――――

2.10 The related work section mainly discusses general AI and renewable energy studies. To situate this work within the state of the art, more recent applications of digital twins (2022–2025), particularly in water systems and PV-driven pumping, should be reviewed.

Response:

Thanks for your feedback. We have revised the Introduction section and added a paragraph to explicitly compare our approach with prior works in digital twin application and AI-driven PV-pump optimization. The paragraph starts with “While digital twin applications in renewable energy have”

――――――――――――――――――――――――――――――――――――――――――――――――――

The conclusion highlights performance but underplays practical deployment issues such as sensor failure, data latency, and maintenance needs. A dedicated subsection on limitations and future directions, with attention to real-world adoption barriers, would make the study more balanced and credible.

Response:

Thank you for your feedback. The Future Work has been revised and now includes limitations and suggested future research directions.

Reviewer 3 Report

Comments and Suggestions for Authors

Although including experimental work, the paper is written in a disorganized manner, which distracts the reader. In addition, many parts seem to be written using a generative AI tool(s). The paper is missing a critical literature review to well-situate the contributions against a multitude of studies about water pumping systems optimization. Many parts are not referenced. The paper is excessively long without clear contributions. The authors should consider the following detailed comments. They should also prevent stuffing.

The authors should include more numerical results in the abstract, mainly related to the volume of water lifted based on the system’s operational parameters.
The paragraph from Line 83 to Line 99 should be referenced/justified.
The authors are strongly encouraged to conduct a causality/correlation analysis between the model features and the output, using linear and nonlinear correlation metrics such as Pearson, Spearman, and Kendall. In addition, feature redundancy should be considered carefully: if two features are mutually correlated, one of them should be excluded.
What is the difference between the data partitioning sections (L83-L99) and (L344-L357)?

Comments on the Quality of English Language

The English should be revised carefully.

Author Response

Response to Reviewer 3 Comments

Concern: "Paper written in a disorganized manner" and "excessively long without clear contributions"

Response:

Thank you for this valuable feedback. We have reorganized the sections in the following order:

Introduction
Materials and Methods
Results
Discussion
Conclusions

――――――――――――――――――――――――――――――――――――――――――――――――――

Concern: "Many parts seem to be written using generative AI tool(s)"

Response:

Thank you for your concern. The manuscript has been revised and checked for AI-generated text.

――――――――――――――――――――――――――――――――――――――――――――――――――

Concern: "Missing critical literature review to well-situate contributions"

Response: Thank you for your comment. The literature review has been revised.

――――――――――――――――――――――――――――――――――――――――――――――――――

3.1 The authors should include more numerical results in the abstract, mainly related to the volume of water lifted based on the system’s operational parameters.

Response:

Excellent suggestion. We have enhanced the abstract with quantitative water volume results.

――――――――――――――――――――――――――――――――――――――――――――――――――

3.2 The paragraph from Line 83 to Line 99 should be referenced/justified.

Response:

Thank you for your notice. References were added to this paragraph.

――――――――――――――――――――――――――――――――――――――――――――――――――

3.3 The authors are strongly encouraged to conduct a causality/correlation analysis between the model features and the output, using linear and nonlinear correlation metrics such as Pearson, Spearman, and Kendall. In addition, feature redundancy should be considered carefully: if two features are mutually correlated, one of them should be excluded.

Response:

Thank you for this valuable feedback. In the revised manuscript, we have included a comprehensive correlation and multicollinearity analysis to assess feature relationships and redundancy. Specifically, Pearson correlation coefficients between all input features and the target variable are reported, and the complete feature-feature correlation matrix is presented. To further verify feature independence, Variance Inflation Factor (VIF) values were computed for all features; all were below 5, confirming the absence of severe multicollinearity.
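The two quantities named in this response can be illustrated with a small sketch. It covers only the two-predictor case, where the R² of regressing one feature on the other equals the squared Pearson coefficient, so VIF = 1 / (1 − r²); with more features a full regression per feature is needed. The feature names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def vif_two_features(x1, x2):
    """VIF for the two-predictor case, where R^2 equals the squared Pearson r."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical input features (illustrative values, not from the study).
irradiance = [200.0, 400.0, 600.0, 800.0, 1000.0]  # W/m^2
temperature = [22.0, 30.0, 25.0, 35.0, 28.0]       # degrees Celsius
r = pearson(irradiance, temperature)
vif = vif_two_features(irradiance, temperature)
```

Here the moderate correlation between the two assumed features yields a VIF well under the threshold of 5 cited in the response.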

――――――――――――――――――――――――――――――――――――――――――――――――――

3.4 Comment 4: What is the difference between data partitioning sections (Lines 83-99) and (Lines 344-357)?

Response:

Thank you for identifying this confusion.

Clarification:

L83–L99 → Explains what the data are and where they come from.

L344–L357 → Explains how the dataset was split and used in the modeling process.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Thanks to the authors for their effort in enhancing the manuscript. However, a few comments remain.

The conclusion should come before the future work.

Author Response

The conclusion should be before future work

Response 1:

Thank you for the comment. We have moved the Future Work section to appear as the last part of the Conclusion, as suggested.

Reviewer 2 Report

Comments and Suggestions for Authors

The revised manuscript shows meaningful improvement and careful attention to prior feedback. The overall flow and clarity have been strengthened, and the paper is now in good shape. The authors’ effort is appreciated.

Author Response

Thank you for your guidance and efforts.

Reviewer 3 Report

Comments and Suggestions for Authors

The authors didn't properly address the majority of my comments:

The paper is still excessively long.
Rationale behind selecting the tested ML algorithms is missing.
Nonlinear correlation is not considered.
Generative-AI tools continue to be used (Example: newly added paragraph Line 40 to Line 52).

Comments on the Quality of English Language

English needs to be polished.

Author Response

3.1 The paper is still excessively long.

Response:

Thank you for this valuable comment. We have reduced the manuscript by removing 8 redundant figures (residual distributions, absolute error distributions, and timing comparison plots) and 1 theoretical complexity table. All redundant visualizations that duplicated information already presented in tables or other figures have been eliminated.

3.2 Rationale behind selecting the tested ML algorithms is missing.

Response:

Thank you for this valuable comment. The rationale behind selecting the tested ML algorithms has been added.

3.3 Nonlinear correlation is not considered.

Response:

Thank you for this valuable comment. The nonlinear correlation analysis has been added.
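The response does not state which nonlinear measure was added. One common choice, named by the reviewer in Round 1, is Spearman's rank correlation, which is the Pearson coefficient computed on ranks. The sketch below is an illustration under that assumption; the cubic frequency-to-power relation and all values are hypothetical.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but nonlinear relation (cubic, illustrative only).
frequency_hz = [30.0, 35.0, 40.0, 45.0, 50.0]
power_kw = [f ** 3 / 25000.0 for f in frequency_hz]
rho = spearman(frequency_hz, power_kw)
r = pearson(frequency_hz, power_kw)
```

Because the assumed relation is strictly increasing, the rank correlation reaches 1 while the linear coefficient stays below it, which is the kind of gap a nonlinear analysis is meant to expose.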

3.4 Generative-AI tools continue to be used (Example: newly added paragraph Line 40 to Line 52).

Response:

Thank you for your concern. The manuscript has been revised.

Round 3

Reviewer 3 Report

Comments and Suggestions for Authors

Nothing to add.
