Article
Peer-Review Record

End-to-End Lane Detection: A Two-Branch Instance Segmentation Approach

Electronics 2025, 14(7), 1283; https://doi.org/10.3390/electronics14071283
by Ping Wang 1,2,*, Zhe Luo 1, Yunfei Zha 3, Yi Zhang 1,2 and Youming Tang 4
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 24 February 2025 / Revised: 21 March 2025 / Accepted: 23 March 2025 / Published: 25 March 2025

Round 1

Reviewer 1 Report (New Reviewer)

Comments and Suggestions for Authors
  1. Please rewrite the abstract. Focus more on the novelty of your work, and simplify it slightly to make it more accessible.
  2. Figures such as Figure 1 and Figure 3 should be referenced more clearly in the text, with a short description of what they illustrate.
  3. The SE module explanation is well covered, but its mathematical clarity must be improved, ideally with a visual illustration of its impact on feature extraction.
  4. Please explain why the selected loss functions were chosen over alternatives, such as Dice loss for segmentation tasks.
  5. Please provide information about inference time per frame and hardware specifications.
  6. Highlight why the proposed method outperforms others in situations such as:

1) Is it better at detecting curved lanes?

2) Does it generalize better in low-light conditions?

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

The manuscript presents an end-to-end lane detection approach based on a two-branch instance segmentation model. It leverages an encoder-decoder architecture, Feature Pyramid Networks (FPN), residual networks, and a weighted least squares fitting module to improve lane detection accuracy under complex conditions such as occlusions, varying lighting, and crowded scenarios. Experimental results on the CULane and TuSimple datasets show superior performance compared to existing methods. However, there are some major concerns that must be addressed before this work can be deemed suitable for publication.
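As background on the fitting module mentioned above: a weighted least squares polynomial fit of lane points can be sketched in a few lines. The points and weights below are illustrative placeholders, not the weights the paper's model actually generates:

```python
import numpy as np

def fit_lane_wls(xs, ys, weights, degree=2):
    """Fit a polynomial x = f(y) to lane points by weighted least squares.

    Points with larger weights (e.g., nearer the camera, or with higher
    confidence) pull the fit more strongly than less reliable points.
    """
    # np.polyfit's `w` parameter weights each sample's residual.
    coeffs = np.polyfit(ys, xs, deg=degree, w=weights)
    return np.poly1d(coeffs)

# Illustrative lane points (pixel coordinates) and confidence weights.
ys = np.array([100.0, 150.0, 200.0, 250.0, 300.0])
xs = np.array([320.0, 310.0, 305.0, 303.0, 302.0])
w = np.array([0.2, 0.4, 0.6, 0.8, 1.0])  # higher confidence closer to camera

lane = fit_lane_wls(xs, ys, w)
```

The returned `poly1d` object can then be sampled at any row to draw the fitted lane.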

 

1. Computational Complexity & Efficiency

• The proposed model integrates multiple components (FPN, SE module, residual network, least squares fitting), which likely increases computational overhead.

• No discussion on inference speed or real-time feasibility for autonomous driving applications.

 

2. Limited Discussion on Edge Cases

• The paper lacks a detailed failure case analysis (e.g., why and where the model underperforms).

 

3. Lack of Ablation Study

• An ablation study testing the contribution of key modules (e.g., FPN, SE module, least squares fitting) would provide more insight into their necessity.

 

4. Potential Overfitting to Training Datasets

• The high F1 score (96.9%) on TuSimple suggests possible overfitting, especially since TuSimple consists of relatively simple highway scenarios.

• It is unclear whether the model maintains its performance on more diverse datasets.

 

5. Comparison with State-of-the-Art Approaches

• While the manuscript compares with several lane detection methods, it does not include recent transformer-based models or hybrid approaches (e.g., BEV-based methods used in modern autonomous vehicles).

 

6. Hyperparameter Sensitivity

• The manuscript does not discuss how hyperparameters (e.g., the weighting factors α and β in the loss function) were chosen and whether they generalize across datasets.

 

7. Clarity of Methodology Description

• Some parts of the proposed architecture (e.g., the weight generation mechanism in the least squares branch) could be explained more clearly.

• Figures could be better annotated to enhance understanding.

• What does “Fsq” mean following each equation?
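For context on the “Fsq” question above: in the standard Squeeze-and-Excitation formulation, F_sq denotes the “squeeze” step, a global average pooling over each channel. A generic NumPy sketch of the SE operations (not the authors' implementation; the 4→2→4 reduction ratio here is an arbitrary example):

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on a feature map x of shape (C, H, W).

    F_sq   (squeeze): global average pool -> one scalar per channel.
    F_ex   (excite):  two FC layers (ReLU then sigmoid) -> channel weights.
    F_scale:          reweight each channel of x by its learned weight.
    """
    z = x.mean(axis=(1, 2))                 # F_sq: (C,)
    h = np.maximum(w1 @ z, 0.0)             # FC + ReLU: (C//r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))     # FC + sigmoid: (C,), each in (0, 1)
    return x * s[:, None, None]             # F_scale: channel-wise reweighting

# Toy example: 4 channels, 8x8 feature map, reduction ratio r = 2.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w1 = rng.standard_normal((2, 4))  # squeeze 4 -> 2 channels
w2 = rng.standard_normal((4, 2))  # excite 2 -> 4 channels
y = se_block(x, w1, w2)
```

Each output channel is the input channel multiplied by a single learned scalar in (0, 1), which is what lets the network emphasize informative channels.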

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

Summary: The paper presents a dual-branch instance segmentation method for lane detection in autonomous vehicles. It is claimed that the method outperforms five other methods in accuracy.

Review
1. Does the introduction provide sufficient background and include all relevant references? 
--- The introduction outlines the challenges of lane detection and presents a well-structured background on traditional and deep-learning-based approaches. However, because the algorithm is intended for autonomous driving, the authors could also review some methods from a cost-effectiveness perspective.
2. Is the research design appropriate?
--- The research design seems well structured, with experimental validation of accuracy. It could be improved by adding a discussion of computational resources.
3. Are the methods adequately described? 
--- Yes.
4. Are the results clearly presented? If not, which parts of the conclusion should be revised?
--- Yes. The results are presented with tables and comparisons to existing methods.
5. Are the figures and tables clearly explained?
--- Figures effectively illustrate the architecture and model performance, but some diagrams (e.g., Figures 3 and 4 on the Feature Pyramid Network and SE module) could include more explanatory captions to clarify their role in the pipeline. There are some descriptions (for example, lines 151 to 153) that are misplaced: those blocks appear in Figure 3, but the descriptions seem to refer to Figure 2. This will confuse readers.
6. Are the conclusions supported by the results?
--- The conclusions are supported by the experimental results, demonstrating improvements over prior models. However, additional discussion of real-world implementation constraints and potential trade-offs (e.g., memory usage, processing time) would give readers a better view of the method.

Comments on the Quality of English Language

Quality of English Language: 
--- There are multiple paragraphs that are not clear and need revision. For example, in lines 223-224 it is not clear what "it" refers to. Also, there are several sentences that seem to contain errors (the highlighted portions in lines 205, 213, 230, and more).

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report (New Reviewer)

Comments and Suggestions for Authors

The manuscript has improved a lot.

Author Response

Thank you for your thoughtful comments on the revised manuscript, and for your recognition of the improvement, which is especially meaningful to us. Your academic rigor continues to inspire our pursuit of excellence.

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

Thank you for the revision and response. The reviewer’s concerns have been addressed.

Please make sure your discussions about (1) dataset limitations (TuSimple) and (2) hyperparameter selection in the response letter are included in the manuscript.

Author Response

Comments 1: [Please make sure your discussions about (1) dataset limitations (TuSimple) and (2) hyperparameter selection in the response letter are included in the manuscript.]

Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have added the discussion of hyperparameter selection to Section 2.4, lines 306-314.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The network model structure is not detailed enough.

1. "First, an encoder-decoder architecture is used to enhance the model’s ability to recover lane line details"; however, no such detailed network structure diagram appears in the paper. The authors must complete the network structure diagram; otherwise, readers will not be able to reproduce the results. The model structure of a segmentation network mainly includes an encoder, a decoder, and in some cases a classifier. First, the encoder's core function is to extract image features, which is achieved through a series of convolutional and pooling layers. The convolutional layers apply kernels to the image to extract local features, and each kernel generates a feature map; the authors must specify the number and size of the convolutional kernels in each layer. The pooling layers follow the convolutional layers; modes such as max-pooling and average-pooling mainly reduce the data dimension while retaining key feature information, and the authors must clarify how each pooling layer is implemented. Next is the decoder, whose purpose is to restore the high-level semantic features extracted by the encoder to a segmentation result with the same size as the input image; the authors must clarify what techniques are used to achieve this. Moreover, skip connections are often used in the decoder, and the authors must clarify how they are implemented. Finally, for the classifier part, the authors must clarify how the category of each pixel is determined, so that the final segmented image can be obtained.

2. "A cross-scale feature map is constructed by combining Feature Pyramid Networks (FPN) and high-level residual networks to capture lane line features at different scales." FPN is an existing algorithm, not the authors' innovation. What improvements have the authors made to this algorithm? Where in the network is FPN placed? The authors must clarify these network structures; otherwise, readers will not be able to reproduce the content of the paper.

3. Figures 2 to 5 all depict existing, mature algorithms that are widely used in building deep-learning models, and the authors have not made any improvements to them. What is the significance of listing these algorithms?
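To illustrate the level of structural detail requested in point 1 above, a minimal encoder-decoder with one skip connection can be sketched as follows. This is a generic structural illustration only (pooling and nearest-neighbour upsampling stand in for the learned convolutional layers), not the paper's network:

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling on an (H, W) map: the encoder's downsampling step."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour 2x upsampling: the decoder's resolution-recovery step."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def encoder_decoder(x):
    """Two-level encoder-decoder with one skip connection."""
    e1 = max_pool2(x)    # encoder level 1: H/2 x W/2
    e2 = max_pool2(e1)   # encoder level 2 (bottleneck): H/4 x W/4
    d1 = upsample2(e2)   # decode back to H/2 x W/2
    d1 = d1 + e1         # skip connection: fuse encoder features into the decoder
    d0 = upsample2(d1)   # decode back to the input resolution H x W
    return d0

x = np.arange(64, dtype=float).reshape(8, 8)
y = encoder_decoder(x)   # same spatial size as the input
```

A real network would replace the pooling/upsampling stand-ins with convolutional blocks and transposed convolutions (or interpolation plus convolution), which is exactly the detail the reviewer asks the authors to document.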

Comments on the Quality of English Language

The paper contains a large number of English grammatical errors; only a few of them are listed here:

1. In parallel structures, the grammatical forms of some sentences are inconsistent. “This method integrates geometric and visual features, which enhances adaptability to complex and dynamic environmental conditions, improving the model’s performance under varying road and weather conditions.” Suggested revision: “This method integrates geometric and visual features, enhancing adaptability to complex and dynamic environmental conditions and improving the model’s performance under varying road and weather conditions.”

2. The use of articles is incorrect in some sentences. "...improving the detection accuracy and robustness of lane detection..." Suggested revision: "...improving detection accuracy and robustness of lane detection..." (remove the unnecessary article)

3. The relative pronouns in some attributive clauses are used incorrectly. "...the method uses ResNet50 as the backbone network. ResNet50 consists of five convolutional blocks..." Suggested revision: "...the method uses ResNet50 as the backbone network, which consists of five convolutional blocks..."

Reviewer 2 Report

Comments and Suggestions for Authors

This paper proposes a dual-branch segmentation method for lane detection and evaluates it on the CULane and TuSimple datasets, demonstrating robustness in complex environments. It is a valuable study; however, several areas lack clear explanations. The authors compare their method with multiple existing approaches but fail to introduce these methods before presenting results, making it difficult to follow. Additionally, abbreviations are used without explanation, adding to the confusion. The choice of different comparison methods for the two datasets is also unexplained. Furthermore, figures and equations are only briefly mentioned, with many symbols left undefined. To improve clarity, I have outlined the following recommendations:

1.      If the journal does not specifically require section numbering to start from 0, please update the section numbers to begin from 1, such as "1. Introduction" instead of "0. Introduction".

2.      Ensure that when using an abbreviation for the first time, the full name is included, such as ‘SCNN’ on line 41 and ‘CurveLanes-NAS’ on line 48. Please review the entire paper to ensure this issue does not occur elsewhere.

3.      Line 153, which starts with 'where X represents,' should be connected to the previous paragraph rather than starting a new one. Additionally, 'X' does not appear in Figure 2. Furthermore, the figure includes 'BN' and 'ReLU'—please clarify their meanings. Also, explain what '1×1' and '3×3' represent in the context of the figure.

4.      There are issues with how the figures are explained. You cannot simply mention the figure in a single sentence without providing a detailed explanation. It is especially important to clarify the different symbols used in the figures, such as 'S', 'U', and 'P' as their meanings are not defined. Review all figures and ensure that every symbol used is clearly explained.

5.      There are issues with the equations: all the symbols in the equations need to be explained, such as Fsq in Equation 1. Review all the equations and ensure that every symbol used is clearly explained.

6.      Line 297: Clearly state that PF1 is the “harmonic” mean of precision and recall.

7.      The explanation is unclear. In Table 1, you compare Res50, SCNN, Res34-SAD, and Res34-Ultra, but then introduce E2Enet, SCNN, LaneATT, and UFLD without prior mention. What are E2Enet, LaneATT, and UFLD? They need to be introduced before being compared. Additionally, Res34-SAD and Res34-Ultra should also be explained if included in the comparison.

8.      For the TuSimple dataset, you chose six methods for comparison. Why did you not include the same six methods when evaluating the CULane dataset?

9.      Explain how the lane detection results in Table 1 were obtained. Were the lane detection outputs manually verified by humans checking the figures one by one? Additionally, why does the F1-score for the "cross" scene appear as numbers without further clarification?

10.  Why does Table 1 present F1-scores as percentages, while Table 2 does not follow the same format? Ensure consistency in representation.

11.  There is a typo on line 326, where "Figure ??" appears instead of a proper reference.

12.  The values of FP (False Positives) and FN (False Negatives) in Table 2 are somewhat confusing. Since they represent counts of incorrect detections, they should be integers. If they are not, clarify how these values were computed.
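For reference on points 6, 10, and 12 above, the standard definitions are: F1 is the harmonic mean of precision and recall, and both are computed from integer TP/FP/FN counts, so non-integer FP/FN values would indeed need explanation. A minimal sketch (the counts below are hypothetical):

```python
def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall, from raw counts."""
    precision = tp / (tp + fp)  # fraction of detections that are correct
    recall = tp / (tp + fn)     # fraction of ground-truth lanes that were found
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 969 true positives, 20 false positives, 42 false negatives.
score = f1_score(969, 20, 42)   # → 0.969
```

Equivalently, F1 = 2·TP / (2·TP + FP + FN), which makes it easy to check reported scores against the raw counts.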
