Review Reports
- João Batista Firmino Junior*,
- Francisco Dantas Nobre Neto and
- Bruno Neiva Moreno
- et al.
Reviewer 1: Robert Abbas
Reviewer 2: Anonymous
Reviewer 3: Abhishek Gupta
Reviewer 4: Anonymous
Reviewer 5: Silin Zhang
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
N/A
Author Response
Thank you for your review. Attached you will find each question itemized, along with our answers.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper presents a comparative study of Markov Chains (MC) and Hidden Markov Models (HMM) for destination prediction using IoT-enabled vehicular trajectory data. The work is relevant to the fields of intelligent transportation systems and Smart Cities. A key contribution is the proposed Smart Sampling with Data Filtering (SSDF) methodology, which aims to provide a replicable framework for evaluating probabilistic models. The use of a real-world dataset (VED) and the application of inferential statistical testing (t-test) across multiple vehicles strengthen the empirical validation. The findings, indicating a slight but statistically significant advantage for HMM (61% vs. 59% precision) and significant improvement in 78.3% of cases with contextual information, are interesting. However, several major aspects require clarification and improvement before the manuscript can be considered for publication.
- The description of the HMM implementation is confusing and seems potentially mis-specified. The paper states that the origin grid cells are treated as the hidden states, and the day of the week is treated as the observation. In a standard HMM for trajectory prediction, the hidden state typically represents the latent destination or a latent mode of travel, while the observed emissions are the sequence of locations or sensor readings. Treating the known origin as a hidden state and the day as a single, static observation for an entire trajectory deviates from conventional use. The authors must provide a much clearer rationale for this model structure, explain the sequence of observations for the Viterbi algorithm, and justify why this architecture is appropriate for the destination prediction task compared to more standard formulations.
- The final analysis is based on data from only 23 vehicles. While the data balancing algorithm is a positive step, the small sample size (N=23) for the paired t-tests raises concerns about the statistical power and the generalizability of the findings. The authors should discuss the limitations associated with this sample size. A power analysis, or at least a discussion of the effect size and its practical significance alongside the statistical significance, would greatly strengthen the claims. The conclusion that HMM is "better" based on a 2% absolute difference in mean precision with N=23 needs to be tempered with these considerations.
- The paper introduces several critical parameters and thresholds without sufficient justification. For instance: (1) The formulas for markov_adequacy and hmm_adequacy use constants (10 and 5 in the denominators). What is the theoretical or empirical basis for these specific values? (2) The segmentation parameters (100m max displacement, 30min min stopping time, 1000m min subtrajectory length) are stated but not justified. How sensitive are the results to these choices?
- The only contextual feature incorporated is the day of the week. For urban mobility, time of day (e.g., rush hour), weekdays vs. weekends, and potentially historical traffic conditions are likely to be equally or more important. The authors should discuss this limitation and explain why the study was restricted to a single, coarse-grained contextual variable. The future work section mentions this, but its acknowledgment as a limitation of the current study is necessary.
- The discussion focuses on the instances where HMM performed better or similarly to MC. A more balanced discussion should also analyze the cases where MC outperformed HMM (e.g., Vehicles 181, 185, 292, 450) or where the difference was not significant. What might be the characteristics of these vehicles or their trajectories that make the simpler MC model sufficient or even preferable? This analysis is crucial for understanding the trade-offs and guiding model selection in practice.
- While the choice of Precision is justified, relying on a single metric can be limiting. Reporting complementary metrics like Recall or F1-score would provide a more holistic view of performance, especially regarding the models' ability to capture all correct destinations. Furthermore, the paper lacks a simple baseline model (e.g., a "most frequent destination" predictor). Including such a baseline would help contextualize the 59-61% precision achieved by the proposed models and better demonstrate their value.
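The request above for an effect size alongside the paired t-test can be sketched as follows. This is a minimal illustration only: the per-vehicle precision values are hypothetical and are not taken from the manuscript, and the function name `paired_summary` is an assumption introduced here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_summary(a, b):
    """Return (mean difference, paired t statistic, Cohen's d_z) for paired samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    m = mean(diffs)
    s = stdev(diffs)  # sample standard deviation of the differences (n - 1)
    if s == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t_stat = m / (s / sqrt(n))  # paired t statistic with n - 1 degrees of freedom
    d_z = m / s                 # standardized effect size for paired designs
    return m, t_stat, d_z


# Hypothetical per-vehicle mean precisions (one value per vehicle and model).
hmm = [0.62, 0.58, 0.65, 0.60, 0.59]
mc = [0.60, 0.57, 0.61, 0.59, 0.60]

mean_diff, t_stat, d_z = paired_summary(hmm, mc)
print(f"mean difference={mean_diff:.3f}, t={t_stat:.2f}, d_z={d_z:.2f}")
```

Reporting d_z next to the t statistic separates the size of the difference from the sample size: a small absolute gap can reach significance while still being of limited practical importance.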
Author Response
Thank you for your review. Attached you will find each question itemized, along with our answers.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
1. In the introduction, add relevant citations where existing knowledge is reproduced.
2. In the introduction, add two subsections titled "Contributions" and "Organization".
3. In section 2, there is only one subsection 2.1. Use subsections if there are more than one. The current section 2 can be categorized into 3-4 subsections.
4. Section 3 is Background, which implies this is a repetition of existing knowledge and no new knowledge is presented here. As such, more citations are required in this section but there are very few citations in section 3. Please correct.
5. Figure 4 needs a more elaborate description. How were the values calculated?
6. Add a table that lists all the symbols used in the manuscript.
7. Algorithm in appendix can be merged in section 4.
8. In section 5, there are only bar charts. Please add more visualization to enhance the readability of the paper.
9. Tables 9 and 10 have been discussed in just 1-2 sentences. What is to be interpreted from Tables 9 and 10?
10. Table 11 has been discussed in just 1-2 sentences. What is to be interpreted from Table 11?
11. Section 6 can be merged as a subsection in section 5.
12. Although the introduction and related work sections provide a broad overview of the paper, they do not sufficiently highlight the challenges that remain in current approaches or the precise gaps that this paper seeks to address. Clearly articulating how the proposed framework advances beyond existing studies and emphasizing its specific advantages would better contextualize the contributions and underscore the novelty of this work.
13. Please ensure adherence to formula writing conventions, with punctuation marks placed after each equation.
14. Inconsistencies are noted in the reference list, especially in journal title capitalization and formatting. The references should be carefully reviewed and corrected to ensure consistency, accuracy, and completeness.
15. Some of the main concerns are: i) the lack of structure in the manuscript, ii) the presence of lengthy and confusing passages, mostly in Sects. 1-4, and iii) the qualitative comparisons against other methods. All these issues, especially points i) and ii), make the manuscript very difficult to comprehend and unclear to read.
16. There is no comparison of the results with recent literature. This makes it very difficult to gauge the novelty of the proposed approach.
17. The authors have not highlighted how the proposed approach benefits the domain of intelligent transportation systems.
18. In addition, it is not clear to the reader which parts are new and which parts are already known. The Authors must clearly state which parts are new, just incremental, or known. The latter parts should be reduced or summarized to improve clarity.
19. Markov theory is well known, and some figures and equations of Sects. 3 and 4 can be simplified or summarized. Also check for any missing term definitions. Moreover, the definitions of terms, symbols and parameters are spread across several tables, which makes the mathematical parts very difficult to understand.
20. In addition, while some mathematical terms are missing in these tables, some other terms employed in the equations are never defined or referenced in the tables. For these reasons, the Authors are suggested to check all equations, figures and tables of Sects. 3 and 4.
21. Some figures (e.g., Figs. 1, 2, 3) are confusing and uninformative. In contrast, the simulations of Sect. 5 are interesting, even though no comparison with other methods is made. A detailed comparison against other methods is recommended. In fact, the lack of a true performance comparison against other methods strongly reduces the appeal of the manuscript.
22. The manuscript is too long and verbose. It should be reduced by eliminating or summarizing redundant parts referring to works that have been already published.
23. A slightly expanded discussion of the practical implications of the findings would be beneficial.
24. A more prominent discussion on the key assumptions made and their potential impact on the generalizability of the results would be valuable.
25. The font size used in the result figures should be larger. In addition, the image quality is not good when zoomed in, so consider replacing them with vector graphics.
Author Response
Thank you for your review. Attached you will find each question itemized, along with our answers.
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
This study enhances intelligent transportation within constrained vehicle IoT scenarios by comparing lightweight probabilistic models: Markov chains versus Hidden Markov Models. It proposes the SSDF (Smart Sampling and Data Filtering) process to improve experimental reproducibility and computational efficiency, contributing to transforming urban environments into IoT-driven smart cities. The research demonstrates strong positioning and clear application prospects (edge deployment in smart cities). However, the paper contains the following issues:
(1) The statement “the emission matrix must be square with row sums equaling 1” is incorrect. It suffices that the transition matrix and the emission matrix satisfy the conditions; thus, the emission matrix need not be square.
(2) Jointly encoding “day of the week” with “starting grid” results in an extremely large and sparse observation space. The authors should clarify the sparse data processing strategy (e.g., smoothing, merging, dimensionality reduction) employed.
(3) The paper does not explicitly state how the p-value is calculated (is it based on fold accuracy, average accuracy per vehicle, or per trajectory?).
(4) The authors' K-fold cross-validation produces folds that are not independent (training and test sets overlap). Therefore, directly using the fold number as the sample size for t-tests is inappropriate. We recommend adopting vehicle-level paired tests or mixed-effects models to obtain statistically reliable and interpretable significance conclusions.
(5) Individual behavior varies significantly across vehicles (high heterogeneity). The experimental section's sample size of 23 vehicles is relatively small. More independent vehicles are needed as observation units to average individual differences and obtain more generalizable conclusions.
(6) For all formulas presented in the paper, authors should provide detailed definitions of every variable and incorporate these variables into the formulas using abbreviated symbols. Additionally, authors must explicitly state the origins of parameters within the formulas (e.g., 10, 5, 0.5, 0.7, 0.3), providing corresponding literature references or cross-validated analysis results.
(7) Furthermore, this study on endpoint model prediction within an urban IoT context should not only focus on prediction accuracy but also comprehensively consider model inference time and latency per prediction. It is recommended that the authors supplement their work with relevant research to enhance the comprehensiveness of their research framework.
(8) In addition, some recent works for IoV should be discussed, such as "Digital twin-assisted intelligent secure task offloading and caching in blockchain-based vehicular edge computing networks" and "Security issues in Internet of Vehicles (IoV): A comprehensive survey".
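Points (1) and (2) above can be illustrated with a minimal sketch, assuming a toy model with 2 hidden states and 3 observation symbols. The counts, the helper names `normalize_rows` and `is_row_stochastic`, and the choice of additive (Laplace) smoothing are assumptions made here for illustration; they do not describe the authors' implementation.

```python
def normalize_rows(counts, alpha=1.0):
    """Apply additive (Laplace) smoothing, then normalise each row to sum to 1."""
    rows = []
    for row in counts:
        total = sum(row) + alpha * len(row)
        rows.append([(c + alpha) / total for c in row])
    return rows


def is_row_stochastic(matrix, tol=1e-9):
    """Check that every row is a valid probability distribution."""
    return all(
        abs(sum(row) - 1.0) <= tol and all(p >= 0 for p in row)
        for row in matrix
    )


# Hypothetical emission counts: 2 hidden states (rows) x 3 observation symbols (columns).
# The zero entries mimic a sparse observation space.
emission_counts = [[4, 0, 1],
                   [0, 3, 0]]

B = normalize_rows(emission_counts)
print(len(B), "x", len(B[0]), "row-stochastic:", is_row_stochastic(B))
```

The emission matrix here is 2 x 3, so it is not square, yet every row sums to 1. Smoothing also replaces the zero counts with small non-zero probabilities, which is one of the sparse-data strategies mentioned in point (2).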
Comments on the Quality of English Language
Should be improved
Author Response
Thank you for your review. Attached you will find each question itemized, along with our answers.
Author Response File:
Author Response.pdf
Reviewer 5 Report
Comments and Suggestions for Authors
This paper (IoT-3950651) focuses on the IoT-based smart city mobility scenario, compares the performance of the Markov Chain and Hidden Markov Model in urban travel destination prediction, and proposes a Smart Sampling with Data Filtering method to improve model reproducibility and computational efficiency. The research topic has innovation and practical value. However, I think the following aspects need to be clarified.
- Although the related work section presents a comparison of representative studies through tables, it merely lists algorithms, models, and evaluation metrics without analyzing the limitations of existing Markov-based models in traffic trajectory prediction. Moreover, authors don’t clearly explain how this study fills these research gaps, which weakens the persuasiveness of the paper’s claimed contributions.
- The principles of the Markov Chain and HMM are mentioned in Section 3, but the authors don’t explain how the state space size, transition matrix, emission matrix and initial probabilities are determined. The absence of these details makes the model difficult to reproduce.
- The authors state that one of this paper’s innovations is the proposed SSDF method. However, the current explanation of its principle and significance is insufficient, lacking the comparison with other sampling strategies.
- On page 19, lines 554–555, it is stated that “tessellation applied to 100-square-meter grids”. Please elaborate the reasons for selecting 100 square-meter. Why is it a suitable choice for the traffic scenario of the Ann Arbor city? What are its benefits in terms of trajectory partitioning and computational efficiency over other grid sizes?
- In Figures 6, 7, 8 and 12, the text labels are too small. Please adjust them to ensure readability.
- The SSDF method uses an automatic threshold selection algorithm to determine 50 trajectory sequences per vehicle as the optimal filtering threshold. How does a lower or higher threshold affect the performance of the MC and HMM models?
- Except for “day-of-week” feature, the paper does not introduce any other interpretable or discriminative spatiotemporal features. Such a single-feature input may limit the model’s prediction ability and prevent the HMM’s advantages from being fully realized.
- It is mentioned multiple times that MC and HMM are computationally efficient on the IoT device. However, in the experiments section, the authors don’t provide any quantitative results to prove that point. Please compare the training time, inference time, memory usage, and prediction latency of the two models. These data are crucial to demonstrate their feasibility for real-world deployment.
- Mean precision as the only metric to evaluate model performance is a little one-sided. It is recommended to include additional metrics such as recall, F1-score, confusion matrix, or prediction bias distribution to more comprehensively reflect model performance. Especially in trajectory prediction tasks, these metrics are more interpretable and informative than accuracy.
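The complementary metrics requested above can be sketched as follows, together with a simple "most frequent destination" predictor used as the example model. This is a minimal illustration under stated assumptions: the destination labels are hypothetical, and `macro_scores` and `most_frequent_baseline` are names introduced here, not functions from the manuscript.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        prec = tp / (tp + fp) if tp + fp else 0.0  # 0.0 when the label is never predicted
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def most_frequent_baseline(train_destinations, n_predictions):
    """Always predict the destination seen most often in the training data."""
    top = Counter(train_destinations).most_common(1)[0][0]
    return [top] * n_predictions


# Hypothetical destination grid cells for one vehicle.
train = ["A", "A", "B", "A", "C"]
y_true = ["A", "B", "A", "C"]

y_pred = most_frequent_baseline(train, len(y_true))
precision, recall, f1 = macro_scores(y_true, y_pred)
print(f"precision={precision:.3f}, recall={recall:.3f}, F1={f1:.3f}")
```

Because the baseline never predicts the rarer destinations, its macro recall and F1 stay low even when it is right half the time, which is the kind of behaviour a single precision figure can hide.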
Author Response
Thank you for your review. Attached you will find each question itemized, along with our answers.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
N/A
Comments on the Quality of English Language
N/A
Author Response
We thank Reviewer 1.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have revised the paper carefully according to the reviewer's comments, and therefore I recommend acceptance for publication.
Author Response
We thank Reviewer 2 for approving the manuscript and acknowledging the improvements implemented.
Reviewer 3 Report
Comments and Suggestions for Authors
1. It is evident that the authors have addressed most of the comments.
2. The results have been discussed in detail but a comparison of the results with recent works published in last 1-2 years can be presented in a Table in section 5 or 6.
3. In section 2, in first paragraph, there are "endash (—)" used, for example in "provided an overview of methodological trends and revealed a noticeable lack of studies performing direct statistical comparisons between probabilistic models—particularly Markov Chains and Hidden Markov Models—within IoT-based mobility contexts.
Again, it appears in "However, a notable gap was identified regarding pure Markovian techniques—that is, approaches not combined with more recent methods. Furthermore, there was an absence of foundational analyses examining how and to what extent contextual factors influence model accuracy, suggesting the need to mitigate these effects through appropriate sampling strategies.
Similarly, in next paragraph, "data balancing conditions, which creates a gap in understanding how contextual variables - such as day of the week - affect predictive performance. This distinction motivates the comparative framework proposed in this study."
Throughout the paper, this "endash (—)" appears a lot of times.
The esteemed editor might agree with me that this "endash (—)" only appears in ChatGPT generated text. In regular English language, we use comma(,), full stop(.), semicolon(;), and hyphen(-) but not "endash (—)" .
The authors should disclose if the text is generated using ChatGPT/LLMs. Also, remove all the occurrences of "endash (—)" throughout the manuscript.
4. In view of comment 3, I rest it with the esteemed reviewer that if suspected AI writing is confirmed, what should be the decision on the manuscript?
5. Lastly, the authors are advised to use readable color font in the manuscript. The bright green font is not conducive to reading and is uncomfortable to read.
Author Response
Comment: "It is evident that the authors have addressed most of the comments."
Response: We thank Reviewer 3 for acknowledging the work performed.
Comment: "The results have been discussed in detail but a comparison of the results with recent works published in last 1-2 years can be presented in a Table in section 5 or 6."
Response: We added Table 12 in Section 6.1 ("Comparison with Recent Approaches") with a systematic comparison of 11 recent studies published between 2023-2025.
Comment: "Throughout the paper, this 'endash (—)' appears a lot of times... The authors should disclose if the text is generated using ChatGPT/LLMs. Also, remove all the occurrences of 'endash (—)' throughout the manuscript."
Response:
ACTION 1: All em-dashes (—) were removed and replaced with standard punctuation.
ACTION 2: Complete disclosure provided in Section 4.6 "Language Assistance" (lines 900-905) and Acknowledgments (lines 1407-1410), clarifying that Claude AI was used EXCLUSIVELY for translation and stylistic enhancement. All scientific content was developed by the authors.
Comment: "In view of comment 3, I rest it with the esteemed reviewer that if suspected AI writing is confirmed, what should be the decision on the manuscript?"
Response: As demonstrated in R3.3, we provide complete and transparent disclosure about AI usage limited to language assistance, in compliance with MDPI policies.
Comment: "Lastly, the authors are advised to use readable color font in the manuscript. The bright green font is not conducive to reading and is uncomfortable to read."
Response: The submitted version is entirely in black, strictly following the MDPI template. The version with track changes showing reviewer-specific and task-specific revisions is marked in blue for visualization purposes only.
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
The author has carefully addressed the reviewers' comments in the revised manuscript. The main issues have been resolved, and the paper shows significant improvements in the theoretical and algorithm. This research demonstrates a certain degree of novelty and practical value. However, the paper still has the following four issues:
- For related works, the recent works employing DRL using MDP modeling are still missing, namely Digital twin-assisted intelligent secure task offloading and caching in blockchain-based vehicular edge computing networks, Security issues in Internet of Vehicles (IoV): A comprehensive survey.
- For Section 4.5.3, the “independent precision measurements” description for K-fold cross-validation lacks rigor. Since all folds come from the same vehicle trajectory dataset, the 10 measurements may have partial repetition and are not strictly independent samples. Authors should optimize the expression of 'independent' and clarify this data structure's impact on statistical tests.
- For Section Experiments, authors provided details on vehicle trajectory distribution, but screening logic lacks rigor. The “the more repetitions the better the performance” claim introduces result bias, risking sample distortion. Add objective criteria (e.g., trajectory thresholds, observation periods, GPS metrics) and confirm criteria were defined pre-sampling, not post-observation.
- It is recommended that the variables in the formulas be expressed using concise symbolic abbreviations rather than full textual names, in order to improve clarity and ensure consistency in mathematical notation.
Comments on the Quality of English Language
Should be improved
Author Response
Comment: "The author has carefully addressed the reviewers' comments in the revised manuscript. The main issues have been resolved, and the paper shows significant improvements in the theoretical and algorithm. This research demonstrates a certain degree of novelty and practical value. However, the paper still has the following four issues:"
Response: We thank Reviewer 4 for acknowledging the improvements and for the positive assessment regarding novelty and practical value. We carefully addressed each of the four issues raised.
Comment: "For related works, the recent works employing DRL using MDP modeling are still missing, namely Digital twin-assisted intelligent secure task offloading and caching in blockchain-based vehicular edge computing networks, Security issues in Internet of Vehicles (IoV): A comprehensive survey."
Response:
We created a new paragraph (lines 262-278) to cover the topic of IoV, as recommended by the reviewer. Moreover, we included the two references suggested by the reviewer to support this section.
- Reference [25]: Taslimasa et al. 2023 - "Security Issues in Internet of Vehicles (IoV): A Comprehensive Survey"
- Reference [26]: Xu et al. 2024 - "Digital-Twin-Assisted Intelligent Secure Task Offloading and Caching in Blockchain-Based Vehicular Edge Computing Networks"
Comment: "For Section 4.5.3, the 'independent precision measurements' description for K-fold cross-validation lacks rigor. Since all folds come from the same vehicle trajectory dataset, the 10 measurements may have partial repetition and are not strictly independent samples. Authors should optimize the expression of 'independent' and clarify this data structure's impact on statistical tests."
Response: We clarify this issue in Section 4.5.3 (lines 876-883) and Section 5.2 (lines 1031-1047), explaining that "independent" refers to the statistical independence of the training/test partitions (which are temporally disjoint), and not to the absence of recurring spatial patterns. Such recurring patterns are necessary for the algorithm's learning, but they occur at different timestamps. In case of any doubts or imprecisions, please consult the alternative version of the manuscript PDF containing the changes marked per Reviewer.
Comment: "For Section Experiments, authors provided details on vehicle trajectory distribution, but screening logic lacks rigor. The 'the more repetitions the better the performance' claim introduces result bias, risking sample distortion. Add objective criteria (e.g., trajectory thresholds, observation periods, GPS metrics) and confirm criteria were defined pre-sampling, not post-observation."
Response: We have clarified this in Section 4.4 (lines 774-786). The threshold of 50 trajectories was determined through objective mathematical criteria PRE-sampling via Algorithm 1, not by observed performance, thereby avoiding selection bias. In case of any doubts or imprecisions, please consult the alternative version of the manuscript PDF containing the changes marked per Reviewer.
Comment: "It is recommended that the variables in the formulas be expressed using concise symbolic abbreviations rather than full textual names, in order to improve clarity and ensure consistency in mathematical notation."
Response: We have added Table A1 "Notation" in Appendix A consolidating all mathematical notation. Formulas have been revised with concise notation (o, d, w). Furthermore, throughout the text, we identified terms that did not meet the required level of conciseness, which have been corrected accordingly.
Author Response File:
Author Response.pdf
Reviewer 5 Report
Comments and Suggestions for Authors
Thanks for the authors' revision. The paper was improved a lot.
I ensured that the authors addressed the concerns raised appropriately and improved the paper. I am satisfied with the current version and would like to recommend it to be accepted for publication.
Author Response
We thank Reviewer 5 for approving the manuscript and acknowledging the improvements implemented.
Round 3
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have addressed all the comments.
The use of Claude AI for translation and stylistic enhancement has also been disclosed, which is appreciated.
I have no further comments.
Reviewer 4 Report
Comments and Suggestions for Authors
Accept
Comments on the Quality of English Language
Should be improved