Peer-Review Record

MultiNet-GS: Structured Road Perception Model Based on Multi-Task Convolutional Neural Network

Electronics 2023, 12(19), 3994; https://doi.org/10.3390/electronics12193994

by Ang Li ¹, Zhaoyang Zhang ¹,*, Shijie Sun ¹, Mingtao Feng ²,³ and Chengzhong Wu ³

Reviewer 1: Anonymous

Reviewer 2: Dina M. Ibrahim

Reviewer 3: Janusz Bobulski

Reviewer 4: Xuelu Li

Submission received: 6 August 2023 / Revised: 17 September 2023 / Accepted: 20 September 2023 / Published: 22 September 2023

(This article belongs to the Special Issue Machine Learning Techniques in Autonomous Driving)

Round 1

Reviewer 1 Report

The article addresses the topical problem of environment perception in autonomous driving on structured roads. The authors use convolutional neural networks to solve this problem. The paper has practical value.

Suggestions:

1. The article does not state the research problem (verbal or mathematical).

2. The authors need to introduce a separate section for the mathematical description of the developed methods.

3. Since the algorithms must work in real time, it is necessary to evaluate the computational complexity of the developed algorithms.

4. The authors need to provide experiments on the comprehensive assessment of the accuracy and time of the developed algorithms.

5. Increase the number of references. Please do so by citing papers from the last 3-5 years published in high-impact journals.

6. The conclusions in the article are not specific. The authors must state the scientific novelty and practical significance of the work.

7. Figure 2 in the introduction does not carry information. Therefore, it is desirable to remove it.

8. The authors need to improve the quality of Figure 3.

Author Response

Dear reviewer,

Thank you very much for your comments and suggestions. They are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. We have studied the comments carefully and made corrections that we hope will meet with your approval. Revised portions are marked in red in the paper. The main corrections in the paper and our responses to the reviewer's comments are as follows:

In the Introduction section of the paper, we supplemented the current research status and existing problems, as well as our contributions toward solving those problems.

We added a separate section that describes the lane detection method mathematically and supplemented the corresponding formulas. The earlier improvements are mainly structural innovations in the CNN modules, so they involve little mathematical description.

We added a new experiment to evaluate the parameter count and computational cost of our model, using Params and FLOPs as evaluation metrics, and compared the results with current mainstream multi-task network models.
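As a rough illustration of what the Params and FLOPs metrics measure, the following minimal sketch counts both for a single 2-D convolution layer. It is not the authors' evaluation code: the function names and the example layer sizes are hypothetical, and it assumes the common convention of counting one multiply-accumulate as two FLOPs (some tools report multiply-accumulates directly).

```python
def conv2d_params(c_in, c_out, k, groups=1, bias=True):
    """Learnable parameters of a k x k 2-D convolution."""
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


def conv2d_flops(c_in, c_out, k, h_out, w_out, groups=1):
    """Approximate FLOPs of the same layer (bias additions ignored).

    Assumption: one multiply-accumulate is counted as 2 FLOPs.
    """
    macs = (c_in // groups) * c_out * k * k * h_out * w_out
    return 2 * macs


# Hypothetical layer: 3 x 3 convolution, 64 -> 128 channels, 80 x 80 output map.
params = conv2d_params(64, 128, 3)
flops = conv2d_flops(64, 128, 3, 80, 80)
print(f"Params: {params / 1e6:.3f} M, FLOPs: {flops / 1e9:.3f} G")
```

Summing these per-layer counts over a whole network gives the model-level figures. Setting `groups` equal to `c_in` models a depthwise convolution, which is one way lightweight layers reduce both numbers.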

Finally, we added an experiment on the NVIDIA Jetson TX2 embedded platform to comprehensively evaluate the performance of our algorithm model and compared it with the current mainstream multi-task network model.

According to your suggestion, we have added 8 references in the last 3-5 years.

We refined the description of the novelty and practical significance of our proposed model in the Conclusion section at the end of the paper, and added a discussion of future work.

We removed original Figure 2.

We have optimized the original Figure 3 to make the text on the picture clearer and easier to read.

Best regards,

Mr. Ang Li

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper is an interesting approach in proposing a MultiNet-GS model, a convolutional neural network model based on an encoder-decoder architecture that tackles multiple tasks simultaneously. The authors introduce a dynamic sparse attention mechanism, BiFormer, in the feature extraction part of the model to achieve more flexible computing resource allocation, which can significantly improve the computational efficiency and occupy a small computational overhead. In addition, they introduce a lightweight convolution, GSConv, in the feature fusion part of the network, which is used to build the Neck part into a new Slim-neck structure, so as to reduce the computational complexity and inference time of the detector.

The article is clear, the literature references are sufficient, and the results are supported by examples. Experimental results are presented to highlight and validate the proposed approach with the support of two case studies.

In a satisfactory manner, the basic purpose of the research has been described, but with some crucial comments that should be taken into consideration.

1. All the final results of the study and the experimental analysis should be stated in the abstract, including the performance metrics in comparison with previous studies.

2. It is better to add a paragraph showing the organization of the rest of the paper “The rest of this paper is organized as follows; Section 2 ………………..”

3. A future work section is needed after, or combined with, the conclusion section.

Author Response

Dear reviewer,

We have supplemented the Abstract section with comparative experimental results.

We have added a paragraph at the end of the Introduction section to introduce the following structure of the article.

We have refined the Conclusion section and added future work at the end.

Best regards,

Mr. Ang Li

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors used the main structure of the latest object detection model, YOLOv8, as the encoder of their model. They introduce a new dynamic sparse attention mechanism, BiFormer, in the feature extraction part of the model to achieve more flexible computing resource allocation, which can significantly improve computational efficiency while adding only a small computational overhead. The experimental results show that the detection performance of the proposed MultiNet-GS model is better than that of the SOTA model (YOLOPv2) in the three tasks of traffic object detection, drivable area detection, and lane detection.

I recommend the publication.

Author Response

Dear reviewer,

Thank you very much for your comments and suggestions, as well as your recognition of our work. We have further optimized and improved the paper, and the modified part has been marked in red font.

Best regards,

Mr. Ang Li

Author Response File: Author Response.pdf

Reviewer 4 Report

In this paper, the authors propose a model that performs multiple tasks simultaneously: object detection, drivable area segmentation, and lane segmentation. Overall, the proposed method is well presented, and many ablation studies and experimental results are shown in comparison with SOTA methods. Although the novelty is incremental (the model is essentially a combination of different well-known structures), the experimental results show that the proposed method is very effective. I have only two concerns about this paper:
1. Can the authors add one more dataset for comparison, to show that the excellent performance of the proposed method is not biased toward a specific dataset?
2. Is it possible to compare the video fps when inference runs only on a CPU? This comparison is important to demonstrate the effectiveness of the proposed method for on-device applications.
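The frame-rate comparison raised in the second concern could be measured with a timing harness along the following lines. This is a minimal sketch under stated assumptions, not a procedure taken from the paper: `measure_fps`, `dummy_infer`, and the synthetic frames are hypothetical stand-ins, and a real test would pass the model's CPU inference function and decoded video frames instead.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Average frames per second of `infer` over `frames`.

    The first `warmup` frames are run once beforehand and excluded from
    timing, so one-off start-up costs do not distort the result.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def dummy_infer(frame):
    """Hypothetical stand-in for a model's forward pass."""
    return sum(frame)


# Synthetic "frames"; a real test would use decoded video frames.
frames = [list(range(1000)) for _ in range(50)]
print(f"{measure_fps(dummy_infer, frames):.1f} FPS")
```

Running every compared model through the same harness, on the same hardware and the same frames, keeps the resulting fps figures comparable.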

Author Response

Dear reviewer,

Because of the special file structure that such a dataset requires, only one multi-task traffic image dataset is currently available online, and we are not able to produce a large dataset of our own in a short time. We apologize for this.

According to your comments, we added the comparison experiment of the model on the embedded device NVIDIA Jetson TX2 to evaluate the detection performance of our model on the embedded platform.

Best regards,

Mr. Ang Li

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

1. The article does not state the research problem (verbal or mathematical). This comment has not been corrected

2. The authors need to introduce a separate section for the mathematical description of the developed methods. This comment has not been corrected

3. Since the algorithms must work in real time, it is necessary to evaluate the computational complexity of the developed algorithms. This comment has not been corrected

4. The authors need to provide experiments on the comprehensive assessment of the accuracy and time of the developed algorithms. This comment has been corrected

5. Increase the number of references. Please do so by citing papers from the last 3-5 years published in high-impact journals. This comment has been corrected

6. The conclusions in the article are not specific. The authors must state the scientific novelty and practical significance of the work. This comment has been partially corrected

7. Figure 2 in the introduction does not carry information. Therefore, it is desirable to remove it. This comment has been corrected

8. The authors need to improve the quality of Figure 3. This comment has been corrected

Author Response

Dear reviewer,

We rewrote the Introduction section to describe the research background in detail, list the shortcomings of existing multi-task network methods, and propose our solution to each shortcoming in turn, leading to our contributions.

We have added a Method Details section to the Methodology section to describe several of our proposed methods in detail, optimizing the structure of the Methodology section.

We apologize for misunderstanding your comment earlier. We have now added Params (M) and FLOPs as evaluation metrics in every ablation and comparison experiment to evaluate the parameter count and computational cost of the model, and we have revised all experimental conclusions accordingly.

We have refined the experimental results in the Conclusion section, summarized the research questions raised in the Introduction section, elaborated the effectiveness of our proposed method for solving these questions, and illustrated the scientific novelty and practical significance of our proposed model.

Best regards,

Mr. Ang Li

Author Response File: Author Response.pdf

Round 3

Reviewer 1 Report

The authors took into account all comments. The article may be accepted for publication.
