Review Reports
- Alex Fabián Carvajal,
- Alejandro Collazos and
- Ricardo Salazar-Cabrera*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Gen Li
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you to the authors for the opportunity to review their work. Below, I provide comments and observations that I hope will help in further improving the article:
1. The article shows a clear disproportion between the extensive, highly detailed “Data Preparation” section and the relatively superficial analysis of the modeling results. The authors have devoted considerable space to describing preprocessing operations at the expense of substantive and interpretative analysis. In scientific literature, the description of preprocessing should be sufficiently detailed to allow replication, but it should not dominate over the sections concerning performance evaluation, interpretation of results, and conclusions for practice.
2. In the text, the authors repeatedly emphasize that the study considers 10 key stations. The selection of these so-called “key” stations is the foundation of the entire study, as the model was built solely on data from these locations. Unfortunately, this stage is described very briefly, limited to the statement that the 10 stations were chosen after “a process of reviewing and counting the number of validations in a given period”. However, no detailed selection criterion is given, so the following is unclear or can only be guessed:
• whether the choice was based solely on total passenger volume, or whether network importance was also considered (e.g., transfer hubs, stations with high capacity impact),
• whether the station ranking was based on data from a single day, a week, or a longer time horizon, which may lead to the incidental inclusion of stations with temporarily high traffic,
• whether the list of 10 stations is constant over time, or changes depending on season, day of the week, or special events.
3. In addition, the authors did not present any verification of whether the model’s conclusions would be similar if other sets of stations were used (e.g., stations with medium load or different geographical locations). Such a lack of transparency carries the risk of selection bias. The model may be tailored to the specific profile of the chosen stations, making the results non-representative for the entire BRT system. Furthermore, limiting the analysis to the most heavily loaded stations may overestimate or underestimate prediction quality metrics, because traffic patterns at extremely busy stations are often more predictable (regular peaks) than at stations with variable or low load.
4. The article does not indicate whether the proposed model performs well under high fluctuations in demand throughout the day, which occur in urban transport. Therefore, it is unclear to what extent it handles peak travel periods, which are crucial from the perspective of transport system design and organization.
5. The article and the considerations presented within it are poorly visualized. This makes it difficult to understand the reasoning and its potential use in the practical design and organization of BRT systems. For example, there is no visualization of the differences between actual and predicted time series. In predictive studies in public transport, it is common to include: line charts of actual and predicted load over time, heatmaps showing relative errors at different times of the day and days of the week, and histograms or box plots showing error distributions to illustrate the variability of underestimations and overestimations.
6. The article does not present credible sensitivity analyses of the results with respect to the choice of input data (e.g., tests on smaller samples, other time ranges, or with additional variables). This means we do not know whether the obtained results are robust and generalizable, or whether they are the effect of a very specific, narrowly selected dataset. This impacts the universality, and thus the practical usefulness of the presented considerations.
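The diagnostics requested in comments 4 and 5 could be sketched roughly as follows. This is only an illustration and not the authors' method: the records, the set of peak hours, and the function names are all hypothetical. It shows how absolute errors might be grouped by hour of day and split into peak and off-peak periods, which are the numbers that a heatmap or box plot of errors would be built from.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (hour_of_day, actual_load, predicted_load).
records = [
    (6, 120, 110), (7, 480, 430), (8, 510, 540),
    (12, 200, 190), (13, 210, 215),
    (17, 450, 400), (18, 470, 495), (22, 60, 62),
]

# Assumed peak hours; a real study would take these from the operator.
PEAK_HOURS = {7, 8, 17, 18}


def mae_by_hour(rows):
    """Mean absolute error grouped by hour of day."""
    grouped = defaultdict(list)
    for hour, actual, predicted in rows:
        grouped[hour].append(abs(actual - predicted))
    return {hour: mean(errs) for hour, errs in sorted(grouped.items())}


def mae_peak_vs_offpeak(rows, peak_hours):
    """Mean absolute error for peak and off-peak hours.

    Both groups must be non-empty, otherwise mean() raises StatisticsError.
    """
    peak = [abs(a - p) for h, a, p in rows if h in peak_hours]
    off_peak = [abs(a - p) for h, a, p in rows if h not in peak_hours]
    return {"peak": mean(peak), "off_peak": mean(off_peak)}


def signed_errors(rows):
    """Predicted minus actual; positive values are overestimates."""
    return [p - a for _, a, p in rows]


if __name__ == "__main__":
    print(mae_by_hour(records))
    print(mae_peak_vs_offpeak(records, PEAK_HOURS))
    print(signed_errors(records))
```

The signed errors are kept separate from the absolute ones so that a histogram or box plot can show whether the model tends to underestimate or overestimate load.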
Author Response
"Please see the attachment."
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The topic is interesting, but I have the following questions and suggestions.
(1) The title is misleading. "Predicting model" could be changed to "Prediction".
(2) Much of the content in the abstract is not relevant to the central topic of this paper. Thus, it needs to be rewritten.
(3) Contributions of this paper need to be enriched, and in particular, they need to be summarized in contrast to the current literature.
(4) Route planning is mentioned several times in this paper. However, the analysis in this paper is not relevant to route planning.
(5) When the machine learning method is used to predict the load at one station along the BRT line, is any possible correlation between different stations considered? Does this correlation exist or not? This is critical.
(6) Do the results in this paper depend on the adopted machine learning methods? Various ML methods could be used.
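One simple way to examine the question raised in comment (5) is to compute pairwise Pearson correlations between the load series of different stations. The sketch below is a minimal illustration under stated assumptions: the station names and load values are invented, and the paper may treat inter-station dependence differently or not at all.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series_by_station):
    """Pairwise correlations, keyed by (station, station)."""
    names = list(series_by_station)
    return {
        (a, b): pearson(series_by_station[a], series_by_station[b])
        for a in names
        for b in names
    }


# Hypothetical loads per time slot for three stations.
loads = {
    "station_a": [100, 300, 500, 250, 450],
    "station_b": [90, 280, 520, 240, 430],
    "station_c": [500, 300, 100, 350, 150],
}

if __name__ == "__main__":
    for pair, r in correlation_matrix(loads).items():
        print(pair, round(r, 3))
```

A strong positive or negative coefficient between two stations would suggest that the load at one carries information about the other, which is the kind of dependence the comment asks the authors to address.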
Author Response
"Please see the attachment."
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
- The abstract contains a lot of irrelevant content, making it overly lengthy.
- The paper claims that there is no research on passenger flow burden prediction at BRT stations, but in fact, passenger flow prediction at BRT stations has been studied for more than a decade.
- There is also a large body of literature on predicting passenger flow at stations (including buses, subways, etc.) based on machine learning or deep learning. Therefore, the authors need to reorganize the literature review.
- The data description section should be separated from the literature review.
- The data description section uses many screenshots, which should be avoided. In addition, the data description section is too lengthy and should be abbreviated to retain only the necessary data description and cleaning process.
- The author describes many machine learning algorithms, but it seems that only two are used in the end. Why?
Author Response
"Please see the attachment."
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you to the Authors for preparing responses to the first-round comments and for the revisions made to the manuscript. I especially appreciate the shortening of the Data Preparation section, the clarification of the criteria for selecting the 10 stations, and the specification of the model’s full daily training range. Some key issues, however, remain unresolved: the lack of a convincing generalization of results beyond the selected set of 10 stations (which are likely to exhibit greater demand predictability than the others), the absence of visualizations that would aid in understanding the model’s behavior, and the lack of robustness tests. These elements are essential for assessing the credibility, transferability, and reproducibility of the results. I also do not share the Authors’ view that adding visual elements would not improve readability.
On the other hand, although the article still has limitations, the Authors clearly acknowledge them, providing readers with the appropriate interpretive context. Consequently, I believe the manuscript meets the criteria for scholarly publications and is suitable for publication.
Reviewer 2 Report
Comments and Suggestions for Authors
The revision is good.
Reviewer 3 Report
Comments and Suggestions for Authors
All my comments have been addressed.