Article
Peer-Review Record

Advancing Sustainable Mobility: Artificial Intelligence Approaches for Autonomous Vehicle Trajectories in Roundabouts

Sustainability 2025, 17(7), 2988; https://doi.org/10.3390/su17072988
by Salvatore Leonardi 1,*, Natalia Distefano 1 and Chiara Gruden 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 26 February 2025 / Revised: 25 March 2025 / Accepted: 26 March 2025 / Published: 27 March 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

Your paper presents an interesting and well-organized study on AI-based trajectory planning in roundabouts for autonomous vehicles. The manuscript is well-structured, and the English is clear and fluent, making it easy to follow the key ideas and findings. The methodology is sound, and the comparative analysis of different machine learning models provides valuable insights into their applicability in this complex domain.

However, several aspects need to be addressed before the paper can be considered suitable for publication. These include improving the clarity of certain methodological choices, addressing potential overfitting concerns, enhancing the evaluation metrics, and avoiding redundancy in some sections. Additionally, minor adjustments, such as ensuring acronyms are defined only once and improving figure descriptions, would further strengthen the paper.

Below, I have provided specific comments and suggestions to help refine the manuscript. I hope these recommendations will improve the clarity, rigor, and overall impact of the study.

 

  • The authors state that "roundabouts present unique challenges due to their geometric complexity, varying traffic density, and the need for AVs to respond to human-driven vehicles in real time." However, providing a clearer distinction between these challenges and those encountered in other types of intersections or road facilities would be beneficial. A more explicit comparison would strengthen the motivation for focusing on roundabouts and highlight their specific difficulties in the context of autonomous vehicle navigation.

 

  • The statement, "However, existing approaches to trajectory planning often struggle to account for the stochastic nature of real-world driving, where human drivers exhibit different behaviors depending on external factors such as road conditions, visibility, and congestion," would benefit from the inclusion of references to support this claim. Specifically, citations of studies that explicitly discuss the limitations of traditional trajectory planning approaches in handling stochastic driving behaviors would strengthen the argument and provide a more solid foundation for the presented work.

 

  • The statements in lines 63–68 require appropriate references to support the claims made. Adding citations to relevant studies will enhance the credibility of the discussion and provide a stronger foundation for the presented arguments.

 

  • The statement, "The study aims to demonstrate that Neural Network models can outperform conventional and other Machine Learning models in capturing complex nonlinear relationships inherent in human driving data," is redundant, as it is both implicitly included in the main objective and explicitly addressed in the following sentence: "Through a comparative analysis of different AI techniques, this study aims to determine which models best deal with the non-linearity of vehicle movement and provide the most reliable trajectory predictions." To improve clarity and avoid redundancy, the authors should remove the statement about Neural Networks in lines 75–76.

 

  • The statement in lines 176–181 is redundant and should be removed, as it largely reiterates key points already discussed in the literature review. The integration of human driving behavior models, real-time adaptability, and robust control mechanisms, as well as the potential of advanced machine learning methods, have been addressed earlier in the manuscript. Removing this section would improve conciseness and avoid unnecessary repetition.

 

  • The term "often" in the statement, "Despite its simplicity, it often provides reliable results and serves as a benchmark for evaluating more complex models," suggests that this claim is based on findings from previous studies. To support this assertion, the authors should include relevant references to studies that have demonstrated the reliability of this approach as a benchmark in trajectory planning and related fields.

 

  • If the use of ChatGPT-4 for dataset processing, model parameter tuning, and efficiency enhancement represents a novel methodological contribution, the authors should explicitly state this. Conversely, if similar applications of ChatGPT-4 in supporting machine learning model development have been explored in previous studies, appropriate references should be added to contextualize this approach within existing research. A citation or an explicit statement would clarify whether this integration is an innovative aspect of the study or aligns with prior methodologies.

 

  • The statement, "To ensure the robustness of the model, a 5-fold cross-validation was performed," requires a reference to justify the choice of 5-fold cross-validation, as the more commonly used approach is 10-fold cross-validation. The authors should cite relevant studies that support the effectiveness of 5-fold cross-validation in similar contexts or provide a brief rationale for selecting this approach over the standard 10-fold method.
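For reference, switching between 5- and 10-fold cross-validation is a one-parameter change in common toolkits; a minimal sketch on synthetic data (assuming a scikit-learn-style workflow, not the authors' actual pipeline or dataset):

```python
# Hypothetical sketch: 5-fold vs. 10-fold cross-validation on synthetic
# data (not the authors' dataset or model).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=50, random_state=0)
for k in (5, 10):
    # cv=k controls the number of folds; the rest of the pipeline is unchanged
    scores = cross_val_score(model, X, y, cv=k, scoring="r2")
    print(f"{k}-fold mean R2: {scores.mean():.3f} (std {scores.std():.3f})")
```

Comparing the fold-to-fold standard deviation across both settings is one simple way to justify the chosen k empirically.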

 

  • Figure 1 and its description do not accurately represent the Random Forest algorithm. The term "random" in Random Forest signifies the use of bootstrap aggregation (bagging) and feature randomness, meaning that each individual decision tree (CART) is not necessarily trained with all input features. The current depiction suggests that every tree receives all features, which contradicts the fundamental principle of feature selection in Random Forests. The authors should revise the figure and its explanation to align with the correct methodology. For further details on this aspect, they may refer to https://doi.org/10.1080/17457300.2022.2075397 

 

  • In Figure 3, the "Gradient Boosting Regression (GBR): Simplified Diagram" presents only three stages, while the description states that there are 100 stages in the GBR process. This inconsistency should be addressed in Figure 3.

 

  • The authors should explicitly acknowledge that the Artificial Neural Network (ANN) used in the study is a Multilayer Perceptron (MLP) deep neural network. Specifying this will provide readers with a more precise understanding of the model architecture and its applicability to the problem.

 

  • The evaluation of models in this study is currently based only on concise numerical metrics, specifically MSE and R². However, relying solely on these metrics may obscure the real performance of black-box models, such as those used in this research. I recommend incorporating graphical assessment metrics to provide a more comprehensive evaluation. Suggested visual diagnostics include:
    • Scatter plots between observed and predicted target values
    • Residual distribution plots to assess error patterns
    • Residual QQ plots to check normality assumptions
    • CURE plots to analyze cumulative residuals

This is particularly important because R² can misrepresent model performance, potentially underestimating certain ranges of the dependent variable while overestimating others, leading to misleadingly high R² values. Similarly, MSE squares errors, which may reduce the visibility of outliers' impact. In addition to graphical diagnostics, the authors should also include additional numerical metrics, such as:

    • Root Mean Square Error (RMSE) for interpretability in the same unit as the target variable
    • Mean Absolute Error (MAE) for a more balanced error assessment
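For illustration, the suggested additional metrics are readily available in standard libraries; a minimal sketch on made-up observed/predicted values (hypothetical numbers, not the authors' results):

```python
# Hypothetical sketch: RMSE and MAE on made-up observed/predicted values.
# The residuals computed here are also the basis for the suggested
# residual-distribution, QQ, and CURE diagnostics.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_obs = np.array([10.0, 12.5, 9.8, 14.2, 11.1])
y_pred = np.array([10.4, 12.0, 10.1, 13.6, 11.5])

rmse = np.sqrt(mean_squared_error(y_obs, y_pred))  # same unit as the target
mae = mean_absolute_error(y_obs, y_pred)           # balanced error assessment
residuals = y_obs - y_pred                         # input for residual plots
print(f"RMSE = {rmse:.4f}, MAE = {mae:.4f}, R2 = {r2_score(y_obs, y_pred):.4f}")
```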

 

  • The reported model performance is not clearly distinguished between the training, validation, and test phases, making it difficult to assess potential overfitting issues. This is particularly important because Artificial Neural Networks (ANNs) and Gradient Boosting Regression (GBR) are prone to overfitting, especially when trained on small datasets. To address this, the authors should:
    • Explicitly report performance metrics separately for training, validation, and test sets.
    • Include regularization techniques (e.g., dropout for ANNs, early stopping for GBR) if not already considered.
    • Provide evidence of generalization, such as learning curves or validation loss trends.

For further details on overfitting risks in ANNs, the authors may refer to https://doi.org/10.1177/03611981221111367 
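To illustrate the suggestions above, here is a minimal sketch of per-split reporting and early stopping for GBR on synthetic data (a hypothetical scikit-learn setup, not the authors' actual configuration):

```python
# Hypothetical sketch: explicit train/validation/test split with early
# stopping for Gradient Boosting Regression, on synthetic data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

# n_iter_no_change enables early stopping on an internal validation fraction,
# which limits the number of boosting stages actually fitted.
gbr = GradientBoostingRegressor(
    n_estimators=500, validation_fraction=0.2, n_iter_no_change=10, random_state=0
)
gbr.fit(X_train, y_train)

# Report the metric separately for each split, as requested above.
for name, Xs, ys in [("train", X_train, y_train), ("val", X_val, y_val), ("test", X_test, y_test)]:
    print(f"R2 ({name}) = {r2_score(ys, gbr.predict(Xs)):.3f}")
```

A widening gap between the train and test values of this per-split report is the overfitting symptom the comment asks the authors to make visible.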

 

  • The conclusions should be reorganized to eliminate redundancy and improve clarity. To enhance readability and logical flow, the future research directions (lines 543–566) could be moved to a dedicated section at the end of the Results and Discussion rather than being included in the conclusions.

 

  • Throughout the paper, acronyms are repeatedly defined instead of being introduced only at their first occurrence. To improve readability and avoid redundancy, the authors should define each acronym only the first time it appears and then use it consistently throughout the text. This will enhance clarity and maintain a more professional and concise writing style.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript "ARTIFICIAL INTELLIGENCE APPROACHES FOR ROUNDABOUT PATH PLANNING: ADVANCED ANALYSIS FOR AUTONOMOUS VEHICLES" presents an interesting approach to implementing various machine learning models and neural networks. In their study, the authors demonstrate the development and evaluation of predictive models that aim to replicate human driving behavior in roundabouts, focusing on the critical path radii that define the fastest passage. The manuscript is relevant to the field and presented in a well-structured manner. Most of the cited references are recent publications. All images and tables shown are appropriate and easy to interpret and understand.

However, there are some questions and details that authors need to further explain and include in their manuscript in order to achieve better manuscript quality, such as:

- In line 409, it should be better explained why acceleration from s1 to s2 in TR2 is assumed and what would happen if the vehicle decelerated through this section. The same question applies to the statement in line 410.

- In line 421, it should be precisely stated which data relate to vehicle movement.

- In lines 424-425, it should be further explained why off-peak driving was chosen and what would change in the methodology if vehicles were moving during peak hours.

In any case, after these minor changes, the considered paper should be published.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors
  1. The abstract is too long, please consider reducing it. Additionally, it should be augmented with some quantitative measures of your results.
  2. The use of the M-roundabout, with its major limitation in adjusting the trajectory when there is no traffic, may be highly inaccurate to adopt in this study due to the major differences between the environments.
  3. The paper indicates that the proposed approach provides a safe environment without offering any quantitative or qualitative measure of safety.
  4. The justifications provided for selecting the stated AI models in Section 3.1 are not sufficient; more justification is needed, since other AI models, such as Deep Learning and Transformer-based approaches, can provide much better results. Hence, a detailed justification for the use of the chosen models is required. In fact, there are many recent research articles in this direction that use advanced models, such as deep learning and deep reinforcement learning, that have never been compared against in the experimental results.
  5. Also, I wonder whether linear regression can solve this problem at all, given its complexity and non-linearity; it is very clear from the numerical results in the paper that this is not a suitable model for this problem.
  6. Line 446, "Linear regression, although simple and interpretable, showed quite acceptable performance": actually, the performance of the linear regression model reveals that it is not acceptable at all in most of the cases stated in Table 2, which coincides exactly with comment 5 above.
  7. The results obtained should be justified, not just listed after Table 2. I can see the results in Table 2 clearly; I need justifications for why each model behaves as it does.
  8. The role of ChatGPT should be explained in detail instead of just being mentioned. It would be better to have a block diagram of the whole approach indicating the role of each component, and a complete sub-section in the methodology detailing its role in your approach.
  9. The results section shows that all models but the Neural Network provide poor performance (in most cases) or, at best, fluctuating performance, with no clue as to why the performance fluctuates. From the results section, I conclude that the adopted models (except for the Neural Network) are not suitable for tackling this problem, especially linear regression, as mentioned in the comments above.
  10. The negative values of R-squared in Table 2 indicate, when they occur, a major problem in the learning process of the model. However, there is no word on why these values are negative, what their impact is on adopting the stated model, or what is wrong with the approach itself. This specific point needs substantial justification and explanation.
  11. The paper tries to adopt unsuitable AI models for the given problem, whereas the research community has provided much better AI models for this important problem, which should be investigated and compared with the proposed approach.
  12. Almost 45% of the listed references are outdated, although much more recent alternatives exist.
  13. Many references cited in the literature survey are not directly related to the point and do not provide any contrast to the problem; I wonder why the paper surveys these works, which are not closely related to the proposed framework. Meanwhile, many recent references that tackle the same point are completely ignored in the literature survey.
  14. The results section lacks any comparison with other related models; such a comparison would highlight the performance of the stated models relative to related ones.
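For context on comment 10, R² is not bounded below by zero: it compares a model against the trivial mean predictor, so a model that predicts worse than the mean yields a negative value. A minimal numeric illustration (made-up values, not the paper's data):

```python
# Hypothetical illustration: R2 becomes negative when a model predicts
# worse than a constant equal to the mean of the observations.
import numpy as np

y_obs = np.array([1.0, 2.0, 3.0, 4.0])  # mean = 2.5
y_bad = np.array([4.0, 1.0, 4.0, 1.0])  # systematically wrong predictions

ss_res = np.sum((y_obs - y_bad) ** 2)         # residual sum of squares = 20
ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)  # total sum of squares = 5
r2 = 1 - ss_res / ss_tot
print(r2)  # -3.0: four times worse than predicting the mean
```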

 

Also, minor comments for the English writing of the paper:

English:

  1. Lines 243-244: the phrase "The data set is divided into five subsets." is repeated.
  2. Line 367: "To ensure the model's robustness" should be "To ensure the model robustness"

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

Thanks for addressing my concerns. In my opinion, the paper has been thoroughly improved and can now be considered for publication in Sustainability.

Please see the following comments that should enhance the paper:

  1. The abstract should be organized in just one paragraph;
  2. As in the previous version of the manuscript, I still see that acronyms are defined more than one time (e.g., ML, AV, R2, MSE, SVR, RF,...). Please fix this issue;
  3. In Table 2, the R2 (train) of NN (with R1-S1 variables) is higher than 1 (1.0395). This is not possible; please fix this issue;
  4. In Table 2, the R2 (train, validation, test) of RF, SVR, GBR (with R3-DeltaS3 variables) is lower than 0. This is not possible; please fix this issue;
  5. Important comment: The statement: "NNs prove to be the best performing model, consistently achieving the lowest MSE and highest R² across all datasets. Their strong generalization, as evidenced by minimal deviation between validation and test performance, indicates effective learning without excessive memorization, which is further supported by regularization techniques such as dropout and weight reduction" is not consistent with the results of Table 2, particularly considering the deviation between R2 (train) and R2 (test) observed for R1-S1 and R2-S2. Indeed, for these pairs of variables, the NN is the one that exhibits the highest drop in R2 performance. This is a clear symptom of overfitting of the NN (which I suspected in the previous version of the paper). You can also see that the MSE of the NN is far lower than that of the other ML models. Please rearrange your discussion;
  6. For small numbers, I suggest using scientific notation (e.g., 2.2E-7 instead of 0.000000220000);
  7. Also, showing an MSE like 2.6390295071 is not useful; showing 2.64 is enough and makes Table 2 clearer and more readable. I suggest rearranging Table 2;
  8. Important comment: In Figures 7-8-9-10-11-12-13-14-15, the CURE plots lack upper and lower limits. Without such limits, it is not possible to evaluate whether the deviations of the residuals behave like a "random walk". Please see the seminal work of Hauer and Bamfo (1997), "Two Tools for Finding What Function Links the Dependent Variable to the Explanatory Variables";
  9. Important comment: I think the scatterplots in Figures 7-8-9-10-11-12-13-14-15 do not correctly show the observation-prediction pairs. Indeed, they are very similar to each other; nonetheless, they show very different R2 values, which should indicate a different spread of points along the 45° line.
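The limits requested in comment 8 can be computed directly from the squared residuals; a minimal sketch following the Hauer and Bamfo (1997) construction, on synthetic residuals (hypothetical data, not the manuscript's):

```python
# Hypothetical sketch: CURE plot quantities with +/-2-sigma limits
# following Hauer and Bamfo (1997), on synthetic residuals.
import numpy as np

rng = np.random.default_rng(0)
fitted = np.sort(rng.uniform(0, 10, size=100))  # x-axis: sorted fitted values
residuals = rng.normal(scale=1.0, size=100)     # synthetic residuals

cum_res = np.cumsum(residuals)       # cumulative residuals (the CURE curve)
cum_var = np.cumsum(residuals ** 2)  # cumulative squared residuals
# Limits: +/- 2 * sigma(i) * sqrt(1 - sigma^2(i) / sigma^2(N)),
# which shrink to zero at the last point by construction.
limits = 2 * np.sqrt(cum_var * (1 - cum_var / cum_var[-1]))

inside = np.abs(cum_res) <= limits + 1e-9
print(f"share of points inside the limits: {inside.mean():.2f}")
```

Plotting `cum_res`, `limits`, and `-limits` against `fitted` gives the bounded CURE plot the comment asks for.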

 

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The negative values of R-squared should be further commented on, explaining why they are negative and what this actually means in terms of the suitability of the corresponding adopted model.

Author Response

Please see the attachment

Author Response File: Author Response.pdf
