Open Access Article

Peer-Review Record

Cross-Dataset Data Augmentation Using UMAP for Deep Learning-Based Wind Speed Prediction

Computers 2025, 14(4), 123; https://doi.org/10.3390/computers14040123

by Eder Arley Leon-Gomez *, Andrés Marino Álvarez-Meza * and German Castellanos-Dominguez

Reviewer 1: Vahid Arabzadeh

Reviewer 2: Anonymous

Reviewer 3: Duanbing Chen

Reviewer 4: Anonymous

Submission received: 10 February 2025 / Revised: 14 March 2025 / Accepted: 22 March 2025 / Published: 27 March 2025

(This article belongs to the Special Issue Machine Learning and Statistical Learning with Applications 2025)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Here are my comments:

Please add a brief passage to the abstract presenting the quantitative values from your results.
The novelty is weak and not well explained. Make clear why you chose this approach and exactly which gap you are covering. State the novelties as clear bullet points.
Use figures to show the structure of the model, the preprocessing steps, the exact inputs, the layers, and the expected outcomes clearly.
Present your dataset statistically in a table.
How did you tune the model parameters?
Give more detail in the results analysis, and study the effect of changing the parameters of the applied models.
Compare your results with existing similar works, and explain the limitations clearly.

Author Response

See attached pdf

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper presents a novel approach for enhancing deep learning-based wind speed prediction by employing a Neighborhood Preserving Cross-Dataset Data Augmentation (UMAP-CDDA) framework. The study integrates Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and utilizes cross-dataset data augmentation to enhance predictive accuracy using recurrent neural networks (RNNs). The approach is validated on three geographically distinct datasets from the USA and China, demonstrating its effectiveness in improving forecasting performance. Overall, this work presents a well-structured and impactful contribution to wind speed forecasting, and I suggest publication after the following points are taken into consideration:

Section 4 (Results and Discussion) should be restructured and presented so that the contribution of the paper becomes clearer. Figure 4, as it stands, is very confusing: the velocity distributions are almost the same for all time windows at each of the three airports, and I am not sure that should be the case. In Figure 5, MAPE seems to decrease with increasing threshold, and the reason for this is not clear to me. Then, in Figure 6, MAPE increases with increasing steps and the values reach the order of 100. If that is not in %, then it is problematic. Please look into it. Moreover, please explain the difference between "steps" and "thresholds". Finally, in Equation 17 for MAPE, there is a number "1" next to the absolute difference of true minus predicted. Is that an index, or was it left there by mistake?
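For reference on the unit question, the textbook definition of MAPE is (100 / N) times the sum over n of |y_n - ŷ_n| / |y_n|, which yields a percentage. The sketch below implements that standard definition only; it is not the manuscript's Equation 17, and the wind speeds are hypothetical values chosen for illustration.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error (textbook form), returned in percent."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical wind speeds in m/s (not taken from the paper).
y_true = [2.0, 4.0, 5.0]
y_pred = [1.0, 5.0, 5.0]
print(mape(y_true, y_pred))  # 25.0

# A small true value inflates the ratio: an error of 1 m/s on a
# true speed of 0.5 m/s already gives 200 %.
print(mape([0.5], [1.5]))  # 200.0
```

Under this definition, values on the order of 100 are plausible only as percentages, and they tend to appear when the true wind speed is close to zero.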

Author Response

See attached pdf

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The description should be expressed clearly. Some typos should be corrected.

Comments on the Quality of English Language

The authors propose a novel framework to predict wind speed. The proposed method includes three core components: (1) UMAP is introduced to reduce the dimensionality of nonlinear wind speed data while preserving the neighborhood structures, (2) cross-dataset data augmentation is performed using UMAP, and (3) RNNs are trained on the augmented datasets. The performance of the proposed method is evaluated on three datasets and compared with the basic versions of RNN, GRU, and LSTM. The research topic is interesting and has some application value. However, some problems should be considered further to improve the manuscript.
(1) In Lines 250-252, what is the meaning of N, tau, and M? Are tau time instants reduced to M time instants? The dimensionality before and after the UMAP reduction should be stated clearly.
(2) In the experiments, the authors could compare with other state-of-the-art wind speed prediction methods.
(3) In the description of the Beijing Capital International Airport dataset, a data availability link could be given if possible.
(4) The second item on the far left of Fig. 3 should be dataset 2.
(5) In Fig. 5, the radius should be given in each subfigure.
(6) In Fig. 6, the curves could be grouped according to MSE, kernel MSE, MAE, and UMAP-CDDA, and different line types and line colors could be used to distinguish each curve.
(7) Fig. 6 should be larger, and the number of the fourth subfigure is wrong.
(8) Line 89: "Support SVM" should be "SVM".
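To make the dimensionality question in comment (1) concrete, the sketch below shows one possible reading of the symbols: N windows of tau consecutive time instants each, reduced to M values per window. This reading is an assumption, not something confirmed by the manuscript. The reducer is a toy block average used purely as a placeholder; it is not UMAP and makes no claim to preserve neighborhood structure. The series values are hypothetical.

```python
def sliding_windows(series, tau):
    """Return N = len(series) - tau + 1 windows of tau consecutive values."""
    if tau < 1 or tau > len(series):
        raise ValueError("tau must be between 1 and len(series)")
    return [series[i:i + tau] for i in range(len(series) - tau + 1)]


def block_average(window, M):
    """Placeholder reducer (NOT UMAP): average tau values into M equal blocks."""
    tau = len(window)
    if M < 1 or tau % M != 0:
        raise ValueError("this toy reducer needs tau to be a multiple of M")
    size = tau // M
    return [sum(window[j:j + size]) / size for j in range(0, tau, size)]


# Hypothetical wind speed series in m/s (not taken from the paper).
series = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
tau, M = 4, 2

windows = sliding_windows(series, tau)             # shape N x tau
reduced = [block_average(w, M) for w in windows]   # shape N x M

print(len(windows), len(windows[0]))  # 5 4
print(len(reduced), len(reduced[0]))  # 5 2
```

Stating the shapes in this form (N x tau before reduction, N x M after) is the kind of clarification the comment asks for, whatever the actual roles of the symbols turn out to be.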

Author Response

See attached pdf

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The study is interesting and meaningful for wind characterization. The data augmentation method is upgraded to perform more robust and cost-effective predictions. The contents of the paper are well organized and explained. The methods are described in sufficient detail. The statistical reporting is rigorous and convincing. The English is well written. The comparison of learning performance across SRNN, LSTM, and GRU is indicative. A few comments are given to further improve the presentation:

In the introduction, given that Uniform Manifold Approximation and Projection (UMAP) and the localized cross-dataset DA approach have been applied in other studies referenced in the manuscript, the new contribution of this work should be stressed.
Avoid using bold type in the text, as in “for assessing the generalizability and performance of wind speed prediction approaches.”.
Please check the mathematical operator “◦” in Eq. (9).
Regarding “The training period for the Argonne IL dataset spanned from 1948-01-01 01:00 to 1994-03-14 10:00, while testing covered 1994-03-14 11:00 to 2005-09-30 24:00.”: the training and testing periods are decades apart. This raises a question, since the wind data may have varied over time due to active climate change in recent years.
In Fig. 4, what are the x- and y-axes?
Please ensure that the coloured font in the caption of Fig. 6 is visible in the published version.
More wind speed data from the predictions should be shown in the Results and Discussion section, as is done in Fig. 1.
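The concern about training and testing periods that lie decades apart can be examined by comparing the wind speed distributions of the two periods. The sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs) in plain Python. It is an illustrative check only, not a procedure from the manuscript, and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs of a and b."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    # Both empirical CDFs are step functions, so the maximum gap
    # is reached at one of the observed values.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(fa - fb))
    return gap


# Hypothetical wind speeds in m/s for an early and a late period.
train_period = [1.0, 2.0, 3.0, 4.0]
test_period = [3.0, 4.0, 5.0, 6.0]

print(ks_statistic(train_period, test_period))   # 0.5
print(ks_statistic(train_period, train_period))  # 0.0
```

A statistic near 0 suggests similar distributions in the two periods, while a value near 1 indicates a strong shift that a chronological split would expose the model to.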
