Printed Edition

A printed edition of this Special Issue is available at MDPI Books....

Open Access | Editor’s Choice | Article

Peer-Review Record

V2X Network-Based Enhanced Cooperative Autonomous Driving for Urban Clusters in Real Time: A Model for Control, Optimization and Security

Electronics 2025, 14(8), 1629; https://doi.org/10.3390/electronics14081629

by Minseong Yoon¹, Dongjun Seo¹, Soyoung Kim² and Keecheon Kim¹,*

Reviewer 1: Anonymous

Reviewer 2: Sahar Ebadinezhad

Reviewer 3: Javier Sánchez-Soriano

Submission received: 14 March 2025 / Revised: 10 April 2025 / Accepted: 15 April 2025 / Published: 17 April 2025

(This article belongs to the Special Issue Security and Privacy for Modern Wireless Communication Systems, 2nd Edition)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper considers a topical issue related to real-time cooperative autonomous driving in urban environments. The study provides a model for control, optimization and security for V2X networks. The paper's contribution is a simulation of a V2X network for cooperative autonomous driving that optimizes onboard systems and ensures safety and real-time performance. The introduction states the problem clearly, but the authors devote twice as much space to explaining their methodology as to presenting the relevance and essence of the problem. References are relevant but incomplete. Some basic standards in the area are not cited. The authors explain the significance of their research, but this explanation should be at the end of Section 2, devoted to related works. The paper is technically sound. After the introduction and discussion of related works, the methodology used is presented and the proposed control model is described. The authors describe the method of autonomous driving, communication and security aspects in cooperative autonomous driving, the method for urban environment perception, and an algorithm for cluster joining and escape. Section 4 presents the results of simulation. The authors are encouraged to consider the following suggestions and remarks to improve the paper quality:

Suggestions

The pedestrian perception in the paper is based on a reference about the V2P network and a limit of braking distance is discussed (ll. 383-392), but it is well known that a V2X network is topologically somewhat different and the safety margin for the distance is only 6.5 m, so it might be worth adding some more discussion or real-time validation, if available.
Some refinement is necessary, since Algorithm 1 seems to return 'true' (see line 15) rather than the output stated at line 2.
The proposal uses EdDSA (line 489 onward), and selecting good EdDSA curves is known to be challenging and time-consuming, so the manuscript would be even better if it explicitly mentioned the node and/or functionality of the proposed topology that is in charge of that kind of security task.
The proposed scheme for anonymization of the vehicle’s ID has to be explained in more detail. How often do nicknames of the vehicles change?
The authors should compare their anonymization scheme with existing standards such as IEEE Standard 1609.2, ETSI EN 302 665, and ETSI TS 102 940, as well as with those already proposed.
What is the scope of the CARLA server? Is it necessary to have such a server in every cell? And isn't this very burdensome for the telecom operator?
If the traffic lights are not working and there is a traffic controller at an intersection, will he be considered a pedestrian and what is the behavior of the vehicle in this case?

Remarks

The introduction contains too many abbreviations which are not explained at their first usage, but only as keywords (avoiding many abbreviations in the abstract is desirable).
In the text, all abbreviations such as V2I, PC5, NR etc. must be given in full before being used.
V2P does not mean “Vehicle-to-Passenger(V2P)” (line 113), but Vehicle-to-Pedestrian.
Expressions such as “allow you to …” should not be used (line 89).
In subsections 2.1-2.3 present tense is used and suddenly in 2.4 the tense is changed to past. Use only present tense.
Figure 2 does not show the “Structure of Pure-Pursuit algorithm”.
Table 2 is not “a message structure”.
Everywhere in the text, put a space between abbreviation and its full description.
The caption of Figure 4, “Figure 4. Network traffic flow for vehicle information…”, must be “Figure 4. Flow for vehicle information…” (network traffic has a different meaning). Further, to avoid confusion, indicate CARLA edge server instead of 5G NR edge server.
“For AI-based perception, the traffic light perception accuracy was 70.7727%, and the time efficiency was 392.074 ms”: if the AI model was developed by other authors, then cite them; otherwise it is not worth mentioning, as the accuracy and time efficiency are low.
There are a lot of typos and semantic errors that have to be removed.
References are not formatted according to the template.
At the end of the paper, a table with abbreviations is missing. As many abbreviations are used, this table is necessary.

Comments on the Quality of English Language

The manuscript requires native English corrections.

Author Response

We would like to express our sincere appreciation to you for your thorough and insightful comments.
We have carefully revised the manuscript based on the suggestions provided, and below are our detailed responses to each point.

Suggestions

Comments 1: The pedestrian perception in the paper is based on a reference about the V2P network and a limit of braking distance is discussed (ll. 383-392), but it is well known that a V2X network is topologically somewhat different and the safety margin for the distance is only 6.5 m, so it might be worth adding some more discussion or real-time validation, if available.

Response 1: We greatly appreciate your valuable comment and agree with the observation. As rightly pointed out, the topological structure of V2X communication is distinct from that of V2P, particularly with respect to communication flow and coverage. To address this, we have clarified in the revised manuscript Section 3.3.2 (line 421) that pedestrian detection in our study is based specifically on V2P communication within the broader V2X network.
We have also expanded the discussion in Section 3.3.2 (lines 428–434) and Section 5 (lines 722–727). In the revised manuscript, we now clarify that the average vehicle speed in this study is 30 km/h, requiring a braking distance of only 1.5 m, which is well within the predefined 6.5 m margin. However, considering real urban conditions in South Korea, where average speeds can reach 50 km/h, we have included a dynamic threshold adjustment mechanism. This algorithmic adjustment increases the braking distance threshold to 20 m when vehicle speeds exceed 40 km/h.
In accordance with your suggestion, we plan to extend this research in the future by incorporating dynamic avoidance path generation instead of complete vehicle stops in response to pedestrian detection. To further enhance system precision, we have also considered restricting the V2P communication boundary to valid road areas using GPS-based geographic filtering, in order to eliminate unnecessary pedestrian detections beyond the effective road perimeter.
We sincerely thank you for the constructive suggestion.
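
The threshold rule described in Response 1 can be illustrated with a short sketch. This is not the code used in the manuscript: the function and constant names are hypothetical, and the sketch assumes that the 6.5 m margin applies at or below 40 km/h and the 20 m threshold above it, as the response states.

```python
# Illustrative sketch only; names and structure are assumptions, not the authors' code.
DEFAULT_THRESHOLD_M = 6.5    # predefined margin stated in the response
EXTENDED_THRESHOLD_M = 20.0  # threshold applied above the speed limit below
SPEED_LIMIT_KMH = 40.0       # speed above which the threshold is raised


def braking_threshold_m(speed_kmh: float) -> float:
    """Return the pedestrian stop threshold for the current speed."""
    if speed_kmh < 0:
        raise ValueError("speed must be non-negative")
    if speed_kmh > SPEED_LIMIT_KMH:
        return EXTENDED_THRESHOLD_M
    return DEFAULT_THRESHOLD_M


def should_stop(pedestrian_distance_m: float, speed_kmh: float) -> bool:
    """Stop when a detected pedestrian is within the speed-dependent threshold."""
    return pedestrian_distance_m <= braking_threshold_m(speed_kmh)
```

Under these assumptions, a pedestrian detected 10 m ahead would trigger a stop at 50 km/h but not at 30 km/h.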

Comments 2: Some refinement is necessary, since Algorithm 1 seems to return 'true' (see line 15) rather than the output stated at line 2.

Response 2: We sincerely thank you for pointing out this discrepancy. As pointed out, the initial version of Algorithm 1 indeed ended with a simple return True, which did not explicitly reflect the outputs stated in line 2. Based on this observation, we revised the algorithm to return the variables specified in the Output description - namely, data["Message Type"] and the IDs of the escaped and joined vehicles.
The current version of Section 3.4 Algorithm 1 (line 15) has been updated accordingly, and now aligns with the intended function behavior. This refinement improves the clarity and consistency between the algorithm’s definition and its described output. We sincerely thank you for the constructive suggestion.
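
To make the change concrete, the following sketch shows a function that returns the declared outputs rather than a bare True. It is a hypothetical reconstruction, not Algorithm 1 itself: only the "Message Type" field name comes from the response, while the "Vehicle ID" field, the "join" and "escape" values, and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def update_cluster(data: Dict[str, str], members: Set[str]) -> Tuple[str, List[str], List[str]]:
    """Apply one cluster message and return (message type, escaped IDs, joined IDs)."""
    escaped: List[str] = []
    joined: List[str] = []
    message_type = data["Message Type"]  # field name taken from the response text
    vehicle_id = data["Vehicle ID"]      # hypothetical field name

    if message_type == "join" and vehicle_id not in members:
        members.add(vehicle_id)
        joined.append(vehicle_id)
    elif message_type == "escape" and vehicle_id in members:
        members.discard(vehicle_id)
        escaped.append(vehicle_id)

    # Return the outputs named in the algorithm header instead of a bare True.
    return message_type, escaped, joined
```

Returning the three values lets the caller see which vehicles changed state without inspecting the member set separately.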

Comments 3: The proposal uses EdDSA (line 489 onward), and selecting good EdDSA curves is known to be challenging and time-consuming, so the manuscript would be even better if it explicitly mentioned the node and/or functionality of the proposed topology that is in charge of that kind of security task.

Response 3: We thank you for the valuable comment regarding the use of EdDSA and the importance of explicitly specifying the responsible node or functionality.
As correctly pointed out, choosing appropriate EdDSA curves can be both challenging and time-consuming. In response to this, we have clarified in Section 3.2 (lines 340–341) that “This EdDSA-based signature is used only during the ‘Init’ stage to authenticate the pseudonymized ID.”
Additionally, in Section 4.2.2 (lines 545–547), we emphasize that EdDSA is selectively applied only during the initialization phase to reduce signature generation latency, while still ensuring authentication.
We believe this clarification enhances the readability and practical relevance of the proposed topology. We sincerely thank you for the constructive suggestion.

Comments 4: The proposed scheme for anonymization of the vehicle’s ID has to be explained in more detail. How often do nicknames of the vehicles change?

Response 4: We sincerely thank you for raising this important point regarding the anonymization mechanism of vehicle IDs.
To address this, we have revised Section 4.2.2 (lines 534–535) to clearly specify the update frequency of pseudonymized IDs. As now stated in the manuscript, “the pseudonymized vehicle ID is updated either per session or periodically when a vehicle connects to a cluster network,” which effectively prevents long-term tracking of a vehicle’s location and behavioral patterns.
In the experimental environment, the update frequency was set relatively high for simulation purposes. Therefore, a more realistic and efficient pseudonym update interval will be further explored in future research.
We sincerely appreciate your insightful comment and kindly ask for understanding regarding this current limitation.
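
The update policy quoted in Response 4 (a new pseudonym per session or periodically) might look roughly like the sketch below. The class name, the 16-character random identifier, and the injectable clock are assumptions for illustration; the manuscript does not specify the pseudonym format or the rotation interval.

```python
import secrets
import time
from typing import Callable


class PseudonymManager:
    """Issue a random pseudonym and rotate it per session or after a fixed interval."""

    def __init__(self, rotation_interval_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = rotation_interval_s  # hypothetical rotation period
        self._clock = clock
        self._rotate()

    def _rotate(self) -> None:
        # 8 random bytes rendered as 16 hex characters; the format is an assumption.
        self._pseudonym = secrets.token_hex(8)
        self._issued_at = self._clock()

    def start_session(self) -> str:
        """Per-session update: always issue a fresh pseudonym on cluster connection."""
        self._rotate()
        return self._pseudonym

    def current(self) -> str:
        """Periodic update: rotate once the interval has elapsed."""
        if self._clock() - self._issued_at >= self._interval:
            self._rotate()
        return self._pseudonym
```

Passing the clock as a parameter keeps the rotation logic testable without waiting for real time to pass.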

Comments 5: The authors should compare their anonymization scheme with existing standards such as IEEE Standard 1609.2, ETSI EN 302 665, and ETSI TS 102 940, as well as with those already proposed.

Response 5: We thank you for the thoughtful suggestion to compare our anonymization scheme with existing V2X security standards.
To address this, we have added a comparative discussion in Section 4.2.2 (lines 550–556), referencing relevant standards such as IEEE 1609.2. Specifically, we clarify that our proposed scheme adopts the core cryptographic primitives recommended by these standards - namely, EdDSA and AES - for securing V2X communications.
In addition, our scheme extends these foundations by incorporating advanced features such as pseudonymized ID management and session-based identity handling. These enhancements not only align with the principles of the existing standards but also improve real-time performance and communication efficiency.
We believe this clarification reinforces the practical relevance and technical soundness of our approach in the context of cooperative autonomous driving. We sincerely thank you for the constructive suggestion.

Comments 6: What is the scope of the CARLA server? Is it necessary to have such a server in every cell? And isn't this very burdensome for the telecom operator?

Response 6: We thank you for raising this important question regarding the scope and practicality of the CARLA server deployment. As clarified in the revised manuscript Section 3.2 (lines 207–300), the CARLA simulator is used to model an edge server operating within a single wireless network cell, representing the coverage area of a typical 5G base station.
This setting is intended purely for simulation purposes, particularly to model communication delays and cluster-level control within a realistic cellular topology. It does not imply that a dedicated server must be deployed in every cell, nor does it suggest a mandatory operational burden for telecom providers. Rather, it provides a conceptual framework for evaluating V2X coordination at the edge.
We hope this clarification removes any ambiguity and reassures readers of the scalability and feasibility of the proposed system. We sincerely thank you for the constructive suggestion.

Comments 7: If the traffic lights are not working and there is a traffic controller at an intersection, will he be considered a pedestrian and what is the behavior of the vehicle in this case?

Response 7: We sincerely appreciate your thoughtful and detailed question regarding the handling of traffic controllers at intersections in the absence of functioning traffic lights.
In this study, traffic light recognition is implemented based on an HSV-based image processing method. Therefore, if traffic lights are non-operational, the system interprets this as a “no_signal” state, in which the vehicle maintains its previous control command.
The simulation was conducted under typical urban environments without the presence of human traffic controllers. Due to current simulation constraints, modeling such human intervention scenarios was not feasible. However, for future work involving real-world deployment of cooperative autonomous driving, we plan to explore approaches such as using camera-based detection of traffic controllers or encoding controller presence into V2X messages for route adaptation and safe maneuvering.
We sincerely apologize for not addressing this specific situation in the current study and respectfully ask for your understanding regarding this limitation.
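
The behaviour described in Response 7 can be sketched as follows. This is a simplified stand-in for the HSV-based method, not the authors' implementation: the hue ranges, the saturation and brightness cut-offs, the 20% pixel fraction, and the command names are all assumptions chosen for illustration, and the sketch uses the standard-library colorsys module on a list of RGB pixels rather than an image-processing library.

```python
import colorsys
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def classify_light(pixels: Iterable[RGB], min_sat: float = 0.5,
                   min_val: float = 0.5, min_fraction: float = 0.2) -> str:
    """Return 'red', 'yellow', 'green', or 'no_signal' for a traffic-light crop."""
    counts = {"red": 0, "yellow": 0, "green": 0}
    total = 0
    for r, g, b in pixels:
        total += 1
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_sat or v < min_val:
            continue  # dark or washed-out pixel: treated as an unlit lamp
        hue = h * 360
        if hue < 20 or hue >= 340:
            counts["red"] += 1
        elif 40 <= hue < 70:
            counts["yellow"] += 1
        elif 90 <= hue < 160:
            counts["green"] += 1
    if total == 0:
        return "no_signal"
    state, hits = max(counts.items(), key=lambda kv: kv[1])
    return state if hits / total >= min_fraction else "no_signal"


def next_command(state: str, previous: str) -> str:
    """Keep the previous command when no lit signal is found."""
    if state == "no_signal":
        return previous
    return {"red": "stop", "yellow": "slow", "green": "go"}[state]
```

With these assumptions, a crop of dark pixels yields "no_signal", so a vehicle that was moving keeps moving, which is the limitation the reviewer's question about human traffic controllers highlights.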

Remarks

Comments 1: The introduction contains too many abbreviations which are not explained at their first usage, but only as keywords (avoiding many abbreviations in the abstract is desirable).

Response 1: We thank you for the valuable observation regarding the excessive use of unexplained abbreviations in the abstract and introduction.
In response, we have carefully revised the manuscript to ensure that all key abbreviations are introduced appropriately at their first occurrence. Specifically, in the abstract, we now provide full definitions for V2X (Vehicle-to-Everything), TSB (Throttle-Steer-Brake), AES (Advanced Encryption Standard), and EdDSA (Edwards-curve Digital Signature Algorithm), which are essential to understanding the scope and security mechanisms of the study.
In the introduction, we have clarified the meanings of PID (Proportional-Integral-Differential), AI (Artificial Intelligence), and HSV (Hue-Saturation-Value) to improve readability.
Furthermore, all abbreviations used throughout the manuscript are now also listed in the Abbreviations section for ease of reference.
We sincerely appreciate your suggestion and apologize for any confusion the initial version may have caused.

Comments 2: In the text, all abbreviations such as V2I, PC5, NR etc. must be given in full before being used.

Response 2: We thank you for the careful observation regarding the proper introduction of abbreviations.
In response, we have ensured that each abbreviation - V2I (Vehicle-to-Infrastructure), NR (New Radio), and PC5 (the direct communication method between vehicles and other devices) - is now introduced with its full form at the point of first usage.
V2I is defined in Section 2.1 (line 79), NR is defined in Section 2.3 (line 131), PC5 is defined in Section 3.2 (lines 301-302).
We appreciate your helpful comment, and we hope these additions improve the manuscript’s clarity and accessibility for readers unfamiliar with these terms.

Comments 3: V2P does not mean “Vehicle-to-Passenger(V2P)” (line 113), but - Vehicle-to-Pedestrian.

Response 3: We sincerely thank you for pointing out the misinterpretation of the V2P abbreviation.
In response, we have corrected the definition in Section 2.3 (line 114) to accurately state that V2P refers to Vehicle-to-Pedestrian communication, not Vehicle-to-Passenger as previously written.
We appreciate your attention to detail and apologize for the oversight in the initial version.

Comments 4: Expressions such as “allow you to …” should not be used (line 89).

Response 4: We thank you for highlighting the inappropriate use of second-person expressions.
In response, the sentence in Section 2.1 (line 90) has been revised to: “Vehicular Ad-hoc Networks (VANETs) are utilized to distribute fundamental tasks of autonomous driving among agents.”
This revision eliminates the informal tone and ensures consistency with academic writing conventions. We sincerely thank you for the constructive suggestion.

Comments 5: In subsections 2.1-2.3 present tense is used and suddenly in 2.4 the tense is changed to past. Use only present tense.

Response 5: We thank you for pointing out the inconsistency in verb tense across Sections 2.1–2.4.
To maintain stylistic and grammatical consistency throughout the manuscript, we have revised Section 2.4 and other relevant parts of the paper to consistently use the present tense. We sincerely thank you for the constructive suggestion.

Comments 6: Figure 2 does not show the “Structure of Pure-Pursuit algorithm”.

Response 6: We thank you for the insightful comment regarding the title and content of Figure 2.
In response, we have revised the figure title to: “Look Ahead Path Following with Target Point and Actual Vehicle Path”, which more accurately reflects the content illustrated - namely, the geometric relationship between the target point and the vehicle’s steering behavior in a look-ahead based controller.
Additionally, we clarify in the text that the look-ahead distance is used to compute the steering angle, and the associated formulation is explicitly provided in Equation (2).
We hope this correction eliminates any ambiguity and improves the interpretability of the figure. We sincerely thank you for the constructive suggestion.

Comments 7: Table 2 is not “a message structure”.

Response 7: We thank you for pointing out the need to refine the labeling and description of Table 2.
To address this, we have revised the table title to: “Message packet details for V2X network,” which more accurately reflects the nature of the content - a field-wise breakdown of the message elements used in V2X communication.
Furthermore, in Section 3.2 (lines 327–328), we now explicitly explain that "Table 2 presents the message format used for inter-vehicle communication, which is designed to securely transmit vehicle identification and driving information."
We believe these revisions improve the clarity and alignment between the table content and its title, and we appreciate your insightful suggestion.

Comments 8: Everywhere in the text, put a space between abbreviation and its full description.

Response 8: We thank you for the stylistic recommendation.
We have thoroughly reviewed the manuscript and ensured that a space is consistently included between each abbreviation and its corresponding full description throughout the text. We sincerely thank you for the constructive suggestion.

Comments 9: The caption of Figure 4, “Figure 4. Network traffic flow for vehicle information…”, must be “Figure 4. Flow for vehicle information…” (network traffic has a different meaning). Further, to avoid confusion, indicate CARLA edge server instead of 5G NR edge server.

Response 9: We thank you for the insightful feedback regarding the labeling and clarity of Figure 4.
In response, we have revised the figure title to “Flow for vehicle information exchange and cluster management”, to avoid the misleading implication of low-level network traffic.
The “5G NR Edge Server” label has been changed to “CARLA Server (Edge Server)” to reflect the simulation environment more accurately and avoid confusion with physical 5G infrastructure.
Additionally, we have color-coded the message types (synchronous, asynchronous, and reply) to enhance the visual clarity of the process flow.
We believe these revisions significantly improve the figure’s accuracy and comprehensibility, and we sincerely appreciate your suggestion.

Comments 10: “For AI-based perception, the traffic light perception accuracy was 70.7727%, and the time efficiency was 392.074 ms”: if the AI model was developed by other authors, then cite them; otherwise it is not worth mentioning, as the accuracy and time efficiency are low.

Response 10: We thank you for the valuable comment regarding the AI-based perception results.
As suggested, we have removed the numerical values for accuracy and time efficiency from Section 4.3.1 (lines 586–587), as the model was not developed by the authors and its performance is relatively low in the context of this study.
The revised sentence is “The AI-based perception method demonstrates relatively lower performance in the experimental evaluation.”
We believe this modification improves the clarity and relevance of the section while avoiding unnecessary emphasis on third-party performance metrics. We sincerely thank you for the constructive suggestion.

Comments 11: There are a lot of typos and semantic errors that have to be removed.

Response 11: We sincerely appreciate your comment regarding typographical, semantic, and grammatical issues.
In response, all authors have carefully re-reviewed the manuscript to correct typographical errors and eliminate semantic ambiguities. We have also made every effort to revise grammatical inconsistencies throughout the paper to ensure clarity and correctness. If any further revisions are required, we are prepared to use the language editing service provided by MDPI Author Services to minimize potential errors.
We believe these revisions have significantly improved the overall readability and academic quality of the manuscript. We sincerely thank you for the constructive suggestion.

Comments 12: References are not formatted according to the template.

Response 12: We thank you for pointing out the formatting inconsistencies related to references and mandatory manuscript sections.
In response, the References section has been revised to fully comply with the MDPI formatting style. We have also added the required sections, including Author Contributions, Funding, Institutional Review Board Statement, Data Availability Statement, Conflicts of Interest, and Abbreviations, in accordance with the journal’s submission guidelines.
We think these updates ensure that the manuscript now adheres fully to the MDPI Electronics template requirements. We sincerely thank you for the constructive suggestion.

Comments 13: At the end of the paper, a table with abbreviations is missing. As many abbreviations are used, this table is necessary.

Response 13: We thank you for the helpful suggestion.
To improve clarity and prevent confusion due to the extensive use of abbreviations, we have added a comprehensive Abbreviations table at the end of the manuscript.
This table lists all abbreviations used in the paper along with their full definitions, following the journal’s formatting guidelines. We sincerely thank you for the constructive suggestion.

We sincerely thank you for your valuable comments and constructive feedback.
We have carefully revised the manuscript in response to each point raised, and we believe that these improvements have significantly enhanced the clarity, accuracy, and quality of our work.

Reviewer 2 Report

Comments and Suggestions for Authors

- From line 30 to 61, it doesn't have a citation.
- The literature gap should be mentioned at the end of section 2.
- In Figure 4, all the messaging types are illustrated the same, but in fact, the reply message, sync message, and async message are different. This figure should be revised.
- Authors should show the algorithm of the main proposed system that included 3 main steps as they claim.
- Real-time performance (latency, processing time), accuracy and dependability (perception accuracy, positioning error, message delivery success rate), security (encryption overhead, authentication success rate), traffic efficiency (cluster formation time, TSB efficiency gain), and safety (collision rate, time-to-collision) are all important performance metrics to support this methodology. By assessing these factors, a reliable, scalable, and secure V2X-based cooperative autonomous driving system that is appropriate for practical implementation is guaranteed.
- Author contributions, funding, and conflicts of interest should be mentioned before the reference list.

Comments on the Quality of English Language

Author Response

Comments and Suggestions for Authors

Comments 1: From line 30 to 61, it doesn't have a citation.

Response 1: We thank you for pointing out the lack of citations in lines 30 to 61.
To address this, we have added appropriate references to support the statements in the corresponding lines: Reference [3] at line 31, Reference [4] at line 38, Reference [5] at line 44, Reference [6] at line 52.
We believe that these additions strengthen the academic foundation and credibility of the introductory discussion. We sincerely thank you once again for the helpful and detailed feedback.

Comments 2: The literature gap should be mentioned at the end of section 2.

Response 2: We thank you for highlighting the importance of explicitly identifying the literature gap.
To address this, we have added a new subsection, Section 2.5, titled “Contribution in Integrated System”, at the end of the related work section. This subsection outlines how our proposed system differs from existing approaches and clearly articulates the contribution of our work in filling the identified research gaps.
We believe this addition provides a clearer context for the motivation and novelty of our study. We sincerely thank you for the constructive suggestion.

Comments 3: In Figure 4, all the messaging types are illustrated the same, but in fact, the reply message, sync message, and async message are different. This figure should be revised.

Response 3: We thank you for the insightful feedback regarding the representation of message types in Figure 4.
In response, we have revised the figure to visually differentiate between the three message types: synchronous, asynchronous, and reply messages. Each message type is now color-coded and labeled (synchronous messages are shown in green, asynchronous messages in red, and reply messages in blue) to clearly reflect their roles and flows within the cooperative vehicle information exchange and cluster management process.
We believe this enhancement significantly improves the interpretability of the figure and more accurately represents the communication dynamics of the proposed system. We sincerely thank you for the constructive suggestion.

Comments 4: Authors should show the algorithm of the main proposed system that included 3 main steps as they claim.

Response 4: We thank you for the helpful suggestion to clearly present the main algorithm of the proposed system.
In response, we have added Algorithm 2 in Section 4.5.2, which illustrates the three core stages of our integrated approach:
1. Pedestrian detection using camera input
2. Traffic light recognition via an HSV-based method
3. Autonomous vehicle control based on the vehicle's role (leader or follower) and V2X communication
This algorithm also clarifies how driving decisions are made depending on the message type (e.g., “waypoints” for path tracking, or “TSB” for direct control input).
We believe this addition enhances the understanding of our system architecture and its operation in a cooperative V2X environment. We sincerely thank you for the constructive suggestion.
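
The three stages listed in Response 4 can be read as a single decision step, sketched below. This is not Algorithm 2 from the manuscript: the Control type, the message dictionary layout ("type", "tsb", "waypoints"), the fallback for a follower with no message, and the track_path callback are hypothetical and serve only to show how the message type could select between path tracking and direct control input.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple

Waypoint = Tuple[float, float]


@dataclass
class Control:
    throttle: float
    steer: float
    brake: float


FULL_STOP = Control(throttle=0.0, steer=0.0, brake=1.0)


def control_step(pedestrian_detected: bool, light_state: str, role: str,
                 message: Optional[Dict], own_route: Sequence[Waypoint],
                 track_path: Callable[[Sequence[Waypoint]], Control]) -> Control:
    # Stage 1: pedestrian detection takes priority over everything else.
    if pedestrian_detected:
        return FULL_STOP
    # Stage 2: traffic light recognition.
    if light_state == "red":
        return FULL_STOP
    # Stage 3: role-dependent control over V2X.
    if role == "leader" or message is None:
        # Assumption: a follower without a message falls back to its own route.
        return track_path(own_route)
    if message["type"] == "TSB":
        throttle, steer, brake = message["tsb"]
        return Control(throttle, steer, brake)  # apply the leader's command directly
    if message["type"] == "waypoints":
        return track_path(message["waypoints"])  # follower runs its own path tracking
    raise ValueError(f"unknown message type: {message['type']!r}")
```

In the "TSB" branch the follower skips path tracking entirely, which is the source of the computational saving the manuscript attributes to TSB driving.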

Comments 5: Real-time performance (latency, processing time), accuracy and dependability (perception accuracy, positioning error, message delivery success rate), security (encryption overhead, authentication success rate), traffic efficiency (cluster formation time, TSB efficiency gain), and safety (collision rate, time-to-collision) are all important performance metrics to support this methodology. By assessing these factors, a reliable, scalable, and secure V2X-based cooperative autonomous driving system that is appropriate for practical implementation is guaranteed.

Response 5: We thank you for emphasizing the importance of comprehensive performance metrics to support the proposed methodology.
In response, we have addressed key performance indicators as follows. In Section 4.5.3 (lines 689–695), we evaluated the real-time performance by measuring the V2X server transmission time and the cluster joining/escape time, both of which were under 1 ms. Specifically, the V2X server’s transmission time averaged 0.152 ms (verified over 1781 instances), and the cluster join/escape process averaged 0.23 ms (verified over 1015 instances). The TSB efficiency is presented in Section 4.1, Table 3. The traffic light perception accuracy is presented in Section 4.3.1, Table 4. As the study is conducted in a simulator environment, external physical influences are absent, and no communication errors or collisions were observed (collision rate = 0%). To acknowledge this limitation, Section 5 (lines 725–728) has been revised to include a statement on the need for further real-world evaluation where dynamic factors may impact system performance.
We sincerely thank you once again for the thoughtful and detailed suggestions, which helped improve the rigor and completeness of our evaluation.
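
Average timings such as those reported in Response 5 are typically obtained by timing many repetitions of an operation. The helper below is a generic sketch of that procedure, not the measurement code used in the study; the function name and the returned fields are assumptions.

```python
import statistics
import time
from typing import Callable, Dict


def measure_ms(operation: Callable[[], object], repeats: int) -> Dict[str, float]:
    """Time an operation repeatedly and summarize the samples in milliseconds."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        operation()
        samples.append((time.perf_counter_ns() - start) / 1e6)  # ns -> ms
    return {
        "count": float(len(samples)),
        "mean_ms": statistics.fmean(samples),
        "max_ms": max(samples),
    }
```

Reporting the maximum alongside the mean helps show whether a sub-millisecond average also holds in the worst case.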

Comments 6: Author contributions, funding, and conflicts of interest should be mentioned before the reference list.

Response 6: We thank you for the formatting recommendation.
In accordance with the MDPI Electronics journal template, we have added and correctly positioned the required sections - Author Contributions, Funding, Institutional Review Board Statement, Data Availability Statement, Conflicts of Interest, and Abbreviations - immediately before the reference list.
This ensures compliance with the journal's formatting standards and improves the clarity and transparency of the manuscript. We sincerely thank you for the constructive suggestion.

Reviewer 3 Report

Comments and Suggestions for Authors

I would like to thank the authors for submitting their manuscript entitled “V2X Network Based Enhanced Cooperative Autonomous Driving for Urban Clusters in Real Time: A Model for Control, Optimization and Security”, which presents a new driving method called TSB (Throttle-Steer-Brake) to optimize real-time performance and minimize the computational load on subsequent vehicles.

The authors’ idea is clearly explained, and experimental results are promising. However, there are several aspects that would benefit from further clarification and refinement. Addressing the following comments will strengthen the manuscript and enhance its overall quality.

Comments and suggestions to authors

It would be useful to highlight more directly the specific gaps in the research that the article aims to address, with explicit references to relevant studies.
The Related Work is detailed, but a stronger and more explicit connection between the limitations identified in the introduction and how the present work is positioned against the existing literature from the outset would strengthen the context. In addition, it is recommended to expand the review with current references, as the present ones may be somewhat outdated in most cases.
Detailed information on the specific urban environment simulated in CARLA is not provided: Details on street design, traffic light layout, pedestrian density, etc. would be necessary for replication.
It is recommended that the description of the V2X network implementation in CARLA could be improved with more details. Although the use of socket programming for communication and packet structure is mentioned, the details of the network configuration, simulated latency, and how the PC5 and Uu interfaces were emulated are not very detailed.
There is no mention of the availability of the source code of the implemented algorithms and simulation scripts. Publishing the source code (e.g. in a repository like GitHub) is highly recommended to improve transparency and reproducibility. Have you contemplated sharing them?
The dataset used to train the perception model description is very brief. The collection of red, yellow and green traffic light images and their labelling is mentioned, but no details are provided on the size of the dataset, the distribution of the classes, the data augmentation techniques used. More information on the dataset should be provided or, ideally, made public or detail how it was generated in CARLA. Have you contemplated sharing them?
No information is provided about the HW (GPU, CPU, RAM, ...) used for training and running the experiments.
There is no mention of the availability of source code, datasets used for prediction with GPR (beyond the general description of historical data collection) or replication scripts. This lack is a significant limitation to replicability. Have you contemplated sharing them?
It would be interesting to go deeper into the discussion section on the possible resistance of the implemented security mechanisms to simulated attacks (even at a conceptual level). Also, to address the possible limitations of the study because the validity of HSV-based traffic light perception could be affected by real-world conditions that are not fully simulated. Additionally, the ‘low’ impact on the computational load of the following vehicle when using TSB driving could be more precisely quantified in the conclusions. It cannot be overlooked that the gap between simulation and real-world implementation remains a point that partially weakens the strength of the argument in terms of immediate applicability, so this should also be addressed.
Several sections required by the journal are missing between section 5. and the References: Supplementary Materials, Author Contributions, Funding, Acknowledgments and Conflicts of Interest.

Author Response

Comments and suggestions to authors

Comments 1: It would be useful to highlight more directly the specific gaps in the research that the article aims to address, with explicit references to relevant studies.

Response 1: We thank you for the valuable suggestion to clarify the specific research gaps addressed by this study.
In response, we have expanded Sections 2.1 through 2.4 to describe the limitations and gaps in previous studies in a more explicit and structured manner, supported by relevant references. Additionally, we have added Section 2.5, titled “Contribution in Integrated System,” to clearly highlight the specific research challenges that this work addresses and how our approach differs from prior studies.
We sincerely appreciate your feedback, which helped improve the clarity and motivation of our research.

Comments 2: The Related Work is detailed, but a stronger and more explicit connection between the limitations identified in the introduction and how the present work is positioned against the existing literature from the outset would strengthen the context. In addition, it is recommended to expand the review with current references, as the present ones may be somewhat outdated in most cases.

Response 2: We thank you for the constructive feedback regarding the contextual linkage between the introduction and the related work, as well as the recommendation to update the references.
To address this, we have added Section 2.5, which clearly articulates how the present study addresses the limitations identified in the introduction and positions itself against existing literature. Furthermore, we have included several recent references to enhance the relevance of the literature review: Reference [3] at line 31, Reference [7] at line 69, and Reference [34] at line 504 are newly added. In total, five newer studies have been incorporated to support up-to-date comparisons. While we have made every effort to cite recent work, we kindly ask for understanding that in the case of the pure-pursuit algorithm, the original formulation, although older, is referenced because it remains the foundational method implemented in our study.
We sincerely appreciate your helpful comments, which contributed to improving the rigor and clarity of our literature positioning.

Comments 3: Detailed information on the specific urban environment simulated in CARLA is not provided: Details on street design, traffic light layout, pedestrian density, etc. would be necessary for replication.

Response 3: We thank you for pointing out the lack of detailed information on the simulated urban environment, which is essential for reproducibility.
In response, we have updated Section 4 (lines 475–484) to provide a comprehensive description of the simulation setup. The experiment is conducted in TOWN 10 of the CARLA simulator, which has a grid layout with diverse intersections, including 4-way yellow-box junctions, dedicated turning lanes, and central reservations. Traffic lights are placed at every intersection where two or more roads meet, and the test route includes four such traffic light-controlled junctions. To evaluate interaction with pedestrians, we tested different densities. With 10 pedestrians, interactions were rare, and with 50, vehicles were frequently blocked. Thus, we set the number of pedestrians to 25 to create balanced and realistic interactions. The entire route, including departure and arrival points, as well as traffic light zones, is now illustrated in Fig. 6.
We hope this detailed clarification supports future replication and strengthens the reliability of our simulation methodology. We sincerely thank you for the constructive suggestion.

Comments 4: It is recommended that the description of the V2X network implementation in CARLA could be improved with more details. Although the use of socket programming for communication and packet structure is mentioned, the details of the network configuration, simulated latency, and how the PC5 and Uu interfaces were emulated are not very detailed.

Response 4: We thank you for pointing out the need for greater clarity in describing the V2X network implementation within the CARLA simulation.
In response, we have revised Section 3.2 (lines 300–308) to provide more detailed information regarding the communication model used in our system. Specifically, we clarify that V2V communication follows the PC5 interface concept (direct communication between vehicles), and V2I communication is modeled conceptually based on the Uu interface (communication via a cellular base station). These interfaces are not physically implemented or emulated, but instead, the communication is abstractly modeled using TCP-based socket communication. As the primary focus of this study is on evaluating control effectiveness rather than precise network modeling, aspects such as latency and wireless channel effects are not simulated.
We hope this explanation clarifies our implementation approach, and we sincerely thank you for the constructive suggestion that helped strengthen the methodological transparency of our work.
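To illustrate the kind of abstraction described in Response 4, the following is a minimal sketch of a lead vehicle sending one control command to a following vehicle over a TCP socket on the loopback interface. It is not the authors' implementation: the JSON encoding, the 4-byte length prefix, the field names `throttle`, `steer`, and `brake`, and the helper names are all assumptions made for illustration. As in the response, no latency or wireless channel effects are modeled.

```python
import json
import socket
import struct
import threading


def send_msg(sock, payload):
    """Send a dict as JSON, prefixed with its length as a 4-byte big-endian integer."""
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)


def recv_exact(sock, n):
    """Read exactly n bytes; TCP is a byte stream, so one recv may return less."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message and decode it to a dict."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


def exchange_once(command):
    """Send one command from a 'lead' socket to a 'follower' socket over loopback."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = {}

    def follower():
        conn, _ = server.accept()
        with conn:
            received.update(recv_msg(conn))

    worker = threading.Thread(target=follower)
    worker.start()
    with socket.create_connection(("127.0.0.1", port)) as lead:
        send_msg(lead, command)  # hypothetical throttle/steer/brake command
    worker.join()
    server.close()
    return received


if __name__ == "__main__":
    print(exchange_once({"throttle": 0.4, "steer": -0.1, "brake": 0.0}))
```

The length prefix is one common way to recover message boundaries on a TCP stream; the packet structure actually used in the manuscript may differ.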

Comments 5: There is no mention of the availability of the source code of the implemented algorithms and simulation scripts. Publishing the source code (e.g. in a repository like GitHub) is highly recommended to improve transparency and reproducibility. Have you contemplated sharing them?

Response 5: We thank you for the valuable suggestion regarding code transparency and reproducibility.
In response, we have published the source code of the proposed system, including the implemented algorithms and simulation scripts, on GitHub: https://github.com/ggaebi99/cooperative-autonomous-driving-in-CARLA-Simulator. The repository includes a ReadMe file that explains how to execute the simulation. While the current version of the code is functional, some parts of the algorithm (e.g., sorting and optimization routines) have not yet been finalized.
We plan to complete and refine these parts and update the repository accordingly in the near future. The GitHub link is also included in the Data Availability Statement section at the end of the manuscript.
We sincerely thank you for encouraging open science and reproducibility.

Comments 6: The dataset used to train the perception model description is very brief. The collection of red, yellow and green traffic light images and their labelling is mentioned, but no details are provided on the size of the dataset, the distribution of the classes, the data augmentation techniques used. More information on the dataset should be provided or, ideally, made public or detail how it was generated in CARLA. Have you contemplated sharing them?

Response 6: We thank you for the thoughtful comment regarding the dataset used to train the perception model.
To address this concern, we have revised Section 4.3.1 (lines 576–577) to clearly state the number of images used for training and their distribution across traffic light classes (red, yellow, and green). In addition, we have shared the dataset publicly via GitHub: https://github.com/ggaebi99/Cooperative-autonomous-driving-in-CARLA-Simulator/tree/main/yolo. This dataset was generated using the CARLA simulator, and all images were manually labeled for supervised learning. Although no data augmentation techniques were applied in the current version, this is planned for future improvement.
We appreciate your suggestion, which helped enhance the transparency and reproducibility of our perception model pipeline.
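For readers who want to check the class distribution of such a dataset themselves, a minimal sketch follows. It assumes the labels are stored as YOLO-style text files, one object per line with the class index as the first field; the mapping of indices to red, yellow, and green and the function names are hypothetical and should be replaced by whatever the dataset's own configuration defines.

```python
from collections import Counter
from pathlib import Path

# Hypothetical class-index mapping; the real one is set by the dataset's configuration.
CLASS_NAMES = {0: "red", 1: "yellow", 2: "green"}


def count_labels(lines):
    """Count class indices in YOLO-style label lines ('<class> <x> <y> <w> <h>')."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        counts[int(parts[0])] += 1
    return counts


def class_distribution(label_dir):
    """Aggregate per-class object counts over every .txt label file in a directory."""
    counts = Counter()
    for path in sorted(Path(label_dir).glob("*.txt")):
        counts.update(count_labels(path.read_text().splitlines()))
    return {CLASS_NAMES.get(k, str(k)): v for k, v in sorted(counts.items())}
```

Calling `class_distribution` on a local copy of the label folder returns a dictionary of object counts per class, which can be compared against the figures reported in Section 4.3.1.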

Comments 7: No information is provided about the HW (GPU, CPU, RAM, ...) used for training and running the experiments.

Response 7: We thank you for pointing out the need to include hardware specifications used for training and experimentation.
To address this, we have updated Section 4 (lines 485–488) to specify the experimental environment as follows: CARLA Simulator version 0.9.12, NVIDIA RTX 3080 GPU, and 64 GB RAM. This addition improves the reproducibility of our experiments and provides context for the system's performance characteristics.
We sincerely appreciate your helpful suggestion.

Comments 8: There is no mention of the availability of source code, datasets used for prediction with GPR (beyond the general description of historical data collection) or replication scripts. This lack is a significant limitation to replicability. Have you contemplated sharing them?

Response 8: We thank you for highlighting the importance of transparency and reproducibility regarding the GPR-based prediction component and related datasets.
To address this, we have made all relevant materials publicly available through our GitHub repository: https://github.com/ggaebi99/cooperative-autonomous-driving-in-CARLA-Simulator. This repository includes the full source code of the proposed system, simulation and control scripts, datasets used for GPR-based vehicle behavior prediction, and a ReadMe file that explains how to run the system and replicate the experiments. Although the current version focuses on core implementation, we plan to further organize and annotate the repository for easier access.
We sincerely appreciate your valuable suggestion, which helped us enhance the openness and reproducibility of this work.

Comments 9: It would be interesting to go deeper into the discussion section on the possible resistance of the implemented security mechanisms to simulated attacks (even at a conceptual level). Also, to address the possible limitations of the study because the validity of HSV-based traffic light perception could be affected by real-world conditions that are not fully simulated. Additionally, the ‘low’ impact on the computational load of the following vehicle when using TSB driving could be more precisely quantified in the conclusions. It cannot be overlooked that the gap between simulation and real-world implementation remains a point that partially weakens the strength of the argument in terms of immediate applicability, so this should also be addressed.

Response 9: We thank you for encouraging a deeper exploration of security resistance, perception limitations, computational impact, and simulation-to-reality considerations.
To address the first point, we have added a new Section 4.2.3 titled "Countermeasures Against Attack Simulations", which conceptually evaluates the robustness of the implemented security mechanisms against common V2X threats. In this section, we explain how the use of EdDSA digital signatures and AES encryption ensures message integrity and authenticity, mitigating man-in-the-middle attacks. We also discuss the inclusion of timestamps and checksums to prevent replay attacks, and the use of pseudonymized vehicle IDs with location-based encryption to prevent tracking and preserve privacy. These enhancements collectively demonstrate the system’s resilience against simulated attacks in a conceptual framework, strengthening the argument for security-by-design in cooperative autonomous systems.
We sincerely appreciate your insightful suggestion, which allowed us to highlight the practical security value of our proposed architecture.
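The timestamp-and-checksum idea mentioned in Response 9 can be sketched as follows. This is an illustrative assumption, not the authors' code: the message layout, the SHA-256 checksum, the 0.5 s freshness window, and the names `make_message` and `ReplayFilter` are hypothetical. The EdDSA signature and AES encryption described in the response are deliberately omitted here; a bare checksum detects corruption and duplicates but does not by itself authenticate the sender.

```python
import hashlib
import json

MAX_AGE_S = 0.5  # hypothetical freshness window, in seconds


def make_message(payload, timestamp):
    """Serialize payload plus timestamp and attach a SHA-256 checksum of the body."""
    body = json.dumps({"payload": payload, "timestamp": timestamp}, sort_keys=True)
    checksum = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"body": body, "checksum": checksum}


class ReplayFilter:
    """Reject messages that are corrupted, stale, future-dated, or already seen."""

    def __init__(self, max_age_s=MAX_AGE_S):
        self.max_age_s = max_age_s
        self.seen = {}  # checksum -> timestamp of accepted messages

    def accept(self, message, now):
        body = message["body"]
        if hashlib.sha256(body.encode("utf-8")).hexdigest() != message["checksum"]:
            return False  # body does not match its checksum
        timestamp = json.loads(body)["timestamp"]
        if not (0 <= now - timestamp <= self.max_age_s):
            return False  # outside the freshness window (too old or in the future)
        # Forget entries older than the window; they would be rejected as stale anyway.
        self.seen = {c: t for c, t in self.seen.items() if now - t <= self.max_age_s}
        if message["checksum"] in self.seen:
            return False  # identical message already accepted: treat as a replay
        self.seen[message["checksum"]] = timestamp
        return True
```

In a complete design the checksum comparison would be replaced or complemented by signature verification, so that an attacker cannot simply recompute the checksum over a modified body.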

Comments 10: Several sections required by the journal are missing between section 5. and the References: Supplementary Materials, Author Contributions, Funding, Acknowledgments and Conflicts of Interest.

Response 10: We thank you for pointing out the missing sections required by the MDPI Electronics template.
In response, we have revised the manuscript to include the following sections at the end of the paper, prior to the References: Author Contributions, Funding, Institutional Review Board Statement, Data Availability Statement, Conflicts of Interest, Abbreviations. These additions ensure full compliance with the Electronics journal’s formatting and submission requirements.
We sincerely appreciate your attention to detail.
