Peer-Review Record

Towards a Block-Level Conformer-Based Python Vulnerability Detection

Software 2024, 3(3), 310-327; https://doi.org/10.3390/software3030016

by Amirreza Bagheri

and Péter Hegedűs^*

Reviewer 1: Anonymous

Reviewer 2:

Dalibor Dobrilovic

Reviewer 3:

Shengwen Li

Submission received: 19 June 2024 / Revised: 23 July 2024 / Accepted: 29 July 2024 / Published: 31 July 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. References should be updated. For example, VCCFinder is not new as pointed out by the authors.

2. Check for grammatical inconsistencies. Example: Line 65 should begin with 'Do large language models...'

3. Add a table summarizing papers discussed in Section 2.

4. Section 3.1 'Contro' should be 'Control'

5. Table 1 is confusing to understand. What is 'Com.'? You need to give reference to the places these numbers were obtained from.

6. Line 219 - Capitalize 'our'

7. Figure 1 has no discussion in the document. Authors should explain why it is there. What exactly is the function of the 'for' loop? What exactly does each abbreviation represent?

8. Line 263 - The authors should refrain from using generic words like 'sorts of graphs'. Please be specific about your research and how many graphs were generated.

9. Lines 304 to 341 can be shortened. At this stage most readers and researchers are familiar with internal workings of transformers and CNNs.

10. The authors should provide a clear discussion of the outcome of the web scraping using GitHub, detailing statistics about the number of projects, datasets, etc. that were returned as output for Section 5.1.1.

11. GitHub or 'github'? Authors should be consistent.

12. Please elaborate on the 'alternative approaches' introduced in line 490.

13. There are several sub-sections in the document making it feel more like a report. The authors are encouraged to reconsider this approach.

14. The authors are encouraged to explain Figure 1 extensively w.r.t. their dataset transformations. So far, the explanations have been very generic. Data transformation is a crucial discussion that is missing in this paper.

15. The authors are highly encouraged to share their research using GitHub repo.

Comments on the Quality of English Language

The authors need to do extensive editing as major grammatical errors can be found. Inconsistencies associated with words are also present (data mining, dataming, GitHub, github etc.). There are also places where the text ends abruptly.

Author Response

1. References should be updated. For example, VCCFinder is not new as pointed out by the authors.

reply : the vector machine part that we mentioned was the new part, and we referenced it. Which part should we change? We know the method is not new, but some parts of it still are.

2. Check for grammatical inconsistencies. Example: Line 65 should begin with 'Do large language models...'

reply : we fixed this and rechecked the grammar.

3. Add a table summarizing papers discussed in Section 2.

reply : done

4. Section 3.1 'Contro' should be 'Control'

reply : fixed

5. Table 1 is confusing to understand. What is 'Com.'? You need to give reference to the places these numbers were obtained from.

reply : we fixed the confusion mentioned and added a reference to our GitHub repository.

6. Line 219 - Capitalize 'our'

reply : fixed

7. Figure 1 has no discussion in the document. Authors should explain why it is there. What exactly is the function of the 'for' loop? What exactly does each abbreviation represent?

reply : we added the explanation at the beginning of the section, with a small change to the figure for better understanding. The figure basically shows the design of the model, and each part of it is explained in Section 3.

8. Line 263 - The authors should refrain from using generic words like 'sorts of graphs'. Please be specific about your research and how many graphs were generated.

reply : fixed

9. Lines 304 to 341 can be shortened. At this stage most readers and researchers are familiar with internal workings of transformers and CNNs.

reply : these lines describe the Conformer and the changes that we made to it. We tried to shorten them as much as we could, but most of the content is needed.

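10. The authors should provide a clear discussion of the outcome of the web scraping using GitHub, detailing statistics about the number of projects, datasets, etc. that were returned as output for Section 5.1.1.

reply : the data mining process is explained here and the exact numbers are shown in Table 2. We added another reference to Table 2 and to the GitHub repository for the dataset itself.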

11. GitHub or 'github'? Authors should be consistent.

reply : fixed

12. Please elaborate on the 'alternative approaches' introduced in line 490.

reply : we added a reference to Table 4.

13. There are several sub-sections in the document making it feel more like a report. The authors are encouraged to reconsider this approach.

reply : will do .

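14. The authors are encouraged to explain Figure 1 extensively w.r.t. their dataset transformations. So far, the explanations have been very generic. Data transformation is a crucial discussion that is missing in this paper.

reply : the whole process is explained in the implementation section. We will change part of it for better understanding.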

15. The authors are highly encouraged to share their research using GitHub repo.

reply : we already put the repository address in the footnote in the introduction.

Comments on the Quality of English Language

reply : fixed

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper introduces an innovative model for vulnerability identification that utilizes code analysis techniques. This method combines self-attention with convolutional networks to collect both localized, position-specific features and global, content-driven interactions. The results justified the proposed method.

This paper is well-structured, and the reference list is up-to-date and adequate. The experiment is well explained. However, this paper has several segments that should be improved:

1. The data set is not explained well. There is no explanation of the table captions in Table 1 (Repo. Com. Files Func. LOC). It is not explained how the data are collected. What is the data source: public or private data? What were the criteria for choosing the data?

2. Fig. 1 details are too small, and details are not readable.

3. Fig. 2 has no caption and no reference in the text.

4. Although the experiment is well explained, the authors should give more details on the HPE system called Komondor and the HPE Apollo 6500 Gen10Plus Blades (1 node per blade, 8 GPUs), including the GPU details if possible.

5. What are the criteria for choosing GitHub projects for the dataset?

6. More details about why “the bits of code that were altered or deleted in such a commit can be labeled as vulnerable, and the version after the fix, as well as all the data around the affected component, can be labeled as (potentially) not vulnerable” are needed. Why are all the changes labeled as vulnerability fixing, and not as bug fixing?

7. Authors should better explain the labeling process. It is not clear how the automated process recognizes which part of the code is vulnerable and which is not. How is it known whether a portion of the code is vulnerable? How does data processing involve equally splitting data into vulnerable and non-vulnerable parts until labeling? How is the sophisticated method of processing source code into blocks performed?

8. Authors claim “The table demonstrates that the presence of structural information has a significant influence on enhancing performance. Eliminating any of the graphs (AST, CFG, or DFG) led to consistent decreases in accuracy and F1-score.” The subsequent tables justify this claim. Because of that, it is important to better explain the inclusion of structural information in this proposal given in subsection 3.1.

9. The same applies to the Conformer and LLM in subsections 3.3 and 3.4.

10. Authors claim “Furthermore, FUNDED performs better than CodeBERT”, but it seems this is not presented in Table 3.

11. It is not completely clear whether the performance metrics presented in Tables 3 and 4 for other methods are taken from the literature sources or derived from the experiments performed by the authors.

12. There are numerous misspellings and errors: Contro and Data Flow Graphs (should be Control), our strategy (small first letter in the sentence), unlikely to be insufficient for this application (small first letter in the sentence), input sequence.A convolutional (space is missing), Vunlnerability, construct both Flow Graphs (CFG) (Control is missing), and numerous other errors … The text should be carefully rechecked, and typing and formatting errors should be corrected.

Author Response

This paper is well-structured, and the reference list is up-to-date and adequate. The experiment is well explained. However, this paper has several segments that should be improved:

The data set is not explained well. There is no explanation of the table captions in Table 1 (Repo. Com. Files Func. LOC). It is not explained how the data are collected. What is the data source: public or private data? What were the criteria for choosing the data?

reply : we explained it completely in the implementation section and changed the table details too.

Fig. 1 details are too small, and details are not readable.

reply : we changed Figure 1 and made it bigger and more readable.

Fig. 2 has no caption and no reference in the text.

reply : we added a caption and a reference.

Although the experiment is well explained, the authors should give more details on the HPE system called Komondor and the HPE Apollo 6500 Gen10Plus Blades (1 node per blade, 8 GPUs), including the GPU details if possible.

reply : we added the full details of the machine that was used.

What are the criteria for choosing GitHub projects for the dataset?

reply : the search started from keywords for the different vulnerabilities that we were looking for, as explained in the implementation section.
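As a rough illustration of the keyword-driven search mentioned in this reply, the sketch below filters commit messages by vulnerability keywords. The keyword list and function names are hypothetical and chosen only for illustration; this record does not state the authors' actual search terms or tooling.

```python
# Hypothetical keyword list: the terms actually used by the authors are
# not given in this review record.
VULN_KEYWORDS = ["sql injection", "xss", "command injection", "path traversal"]


def matched_keywords(message, keywords=VULN_KEYWORDS):
    """Return the keywords that occur in a commit message (case-insensitive)."""
    text = message.lower()
    return [kw for kw in keywords if kw in text]


def select_candidate_commits(commits, keywords=VULN_KEYWORDS):
    """Keep (sha, message) pairs whose message mentions at least one keyword."""
    return [(sha, msg) for sha, msg in commits if matched_keywords(msg, keywords)]


if __name__ == "__main__":
    sample = [
        ("a1", "Fix SQL injection in login form"),
        ("b2", "Update README"),
    ]
    for sha, msg in select_candidate_commits(sample):
        print(sha, matched_keywords(msg))
```

A filter of this kind only yields candidates; commits that match a keyword may still be ordinary bug fixes, which is the concern raised in the next comment.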

More details about why “the bits of code that were altered or deleted in such a commit can be labeled as vulnerable, and the version after the fix, as well as all the data around the affected component, can be labeled as (potentially) not vulnerable” are needed. Why are all the changes labeled as vulnerability fixing, and not as bug fixing?

reply : we explained it in more detail and also referred to our previous paper. However, if you still think that it needs more details, we can certainly add more.
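The commit-based labeling rule quoted in the comment above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes the fix is available as a unified diff, treats removed lines as vulnerable and added or context lines as (potentially) not vulnerable, and uses a hypothetical function name.

```python
def label_diff_lines(diff_text):
    """Label the lines of a unified diff of a vulnerability-fixing commit.

    Removed lines ('-') are labeled 'vulnerable'; added ('+') and context
    (' ') lines are labeled 'not_vulnerable'. Simplification: any line that
    starts with '---' or '+++' is treated as a file header and skipped.
    """
    labeled = []
    for raw in diff_text.splitlines():
        if raw.startswith(("---", "+++", "@@", "diff ", "index ")):
            continue  # file and hunk headers carry no code
        if raw.startswith("-"):
            labeled.append(("vulnerable", raw[1:]))
        elif raw.startswith("+") or raw.startswith(" "):
            labeled.append(("not_vulnerable", raw[1:]))
    return labeled


if __name__ == "__main__":
    diff = (
        "--- a/app.py\n"
        "+++ b/app.py\n"
        "@@ -1,2 +1,2 @@\n"
        " def query(cur, name):\n"
        "-    cur.execute(\"SELECT * FROM t WHERE n='%s'\" % name)\n"
        "+    cur.execute(\"SELECT * FROM t WHERE n=?\", (name,))\n"
    )
    for label, line in label_diff_lines(diff):
        print(f"{label:15s}{line}")
```

The sketch makes the reviewer's question concrete: the rule labels every removed line in a selected commit as vulnerable, so its accuracy depends entirely on how reliably the commit itself was identified as a vulnerability fix.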

Authors should better explain the labeling process. It is not clear how the automated process recognizes which part of the code is vulnerable and which is not. How is it known whether a portion of the code is vulnerable? How does data processing involve equally splitting data into vulnerable and non-vulnerable parts until labeling? How is the sophisticated method of processing source code into blocks performed?

reply : we explained it in more detail and also referred to our previous paper. However, if you still think that it needs more details, we can certainly add more.

Authors claim “The table demonstrates that the presence of structural information has a significant influence on enhancing performance. Eliminating any of the graphs (AST, CFG, or DFG) led to consistent decreases in accuracy and F1-score.” The subsequent tables justify this claim. Because of that, it is important to better explain the inclusion of structural information in this proposal given in subsection 3.1.

reply : we added more details about this in the Approach section, where we explain more about the smaller parts.

The same applies to the Conformer and LLM in subsections 3.3 and 3.4.

reply : as with the previous point, we explained all the details and changes to the Conformer in the Approach section, because it is one of its most important parts, and the LLM in the implementation section.

Authors claim “Furthermore, FUNDED performs better than CodeBERT”, but it seems this is not presented in Table 3.

reply : fixed

It is not completely clear whether the performance metrics presented in Tables 3 and 4 for other methods are taken from the literature sources or derived from the experiments performed by authors.

reply : fixed

There are numerous misspellings and errors: Contro and Data Flow Graphs (should be Control), our strategy (small first letter in the sentence), unlikely to be insufficient for this application (small first letter in the sentence), input sequence.A convolutional (space is missing), Vunlnerability, construct both Flow Graphs (CFG) (Control is missing), and numerous other errors … The text should be carefully rechecked, and typing and formatting errors should be corrected.

reply : fixed

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

1. The problem to be addressed is unclear.

2. Figure 2 is not clear enough to meet the needs of publication, and the caption text is missing.

3. The ablation experiment should more accurately reflect the contribution of each module, rather than emphasizing the completeness of the model in this paper. The sentence “This emphasizes the significance of preserving the model’s structural integrity.” should be rephrased.

4. The results of the experiment are particularly confusing and lack logic. In particular, the objectives to be verified in Tables 3 and 4 are difficult to follow.

5. Of the three research questions raised in this paper, "Does increasing the amount of training data have a positive impact on the performance of models, or have they reached their maximum potential?" was not answered. The other two questions were only partially verified.

6. The contribution of this paper is not well described.

Author Response

The problem to be addressed is unclear.

reply : the problems are addressed as RQs in the paper; the work mostly follows our previous research and tries to solve them.

Figure 2 is not clear enough to meet the needs of publication, and the caption text is missing.

reply : we changed both figures to another format for better quality and to be ready for publication.

The ablation experiment should more accurately reflect the contribution of each module, rather than emphasizing the completeness of the model in this paper.

reply : we added details about that part to place more emphasis on these subjects.

The sentence “This emphasizes the significance of preserving the model’s structural integrity.” should be rephrased.

reply : fixed

The results of the experiment are particularly confusing and lack logic. In particular, the objectives to be verified in Tables 3 and 4 are difficult to follow.

reply : because of the difference in programming language, we could not run the same test with the same dataset to show conclusively that the differences in the results are caused by the method. We therefore conducted different experiments, one with the same dataset and one with a custom dataset built with the same data mining method, and we tried to explain this in the paper as much as possible.

Of the three research questions raised in this paper, "Does increasing the amount of training data have a positive impact on the performance of models, or have they reached their maximum potential?" was not answered. The other two questions were only partially verified.

reply : we added more detailed answers to the questions to explain them more effectively.

The contribution of this paper is not well described

reply : I do not know which contribution you mean; if you can explain more, I will add it.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Multiple grammatical errors still exist. An example is the Table 4 caption: ‘comaprison’ should be ‘comparison’ and ‘dataminig’ should be ‘data mining’. The authors are expected to review the entire document for errors instead of just correcting what the reviewers pointed out.

Comments on the Quality of English Language

Needs improvement

Author Response

Thank you so much for the review.

We checked the whole paper again for grammar, and hopefully it no longer contains any misspellings.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have successfully responded to all my comments and therefore I recommend this publication for publishing.

In Section 4.4 there is one unnumbered figure reference, "Figure ?? displays the different", that should be corrected.

Author Response

Thank you so much for the review.

In Section 4.4 there is one unnumbered figure reference, "Figure ?? displays the different", that should be corrected.

reply : we fixed that problem too.

regards

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

1. If there is only one subsection, it is not appropriate to label it 1.1.

2. The revised manuscript still does not clearly state the research problem and contribution. A problem statement is an explanation in research that describes the issue that is in need of study. Research contribution encapsulates the novel insights, advancements, or solutions that your study promises to offer to the field.

3. There are still many problems with English writing, including grammar and typography.

Comments on the Quality of English Language

There are still many problems with English writing, including grammar and typography.

Author Response

We have carefully addressed your comments and made significant improvements to the manuscript. In response to your feedback, we have clearly stated the research problem and contribution in each section:

Introduction: Emphasized the primary research problem and our contribution to improving the detection accuracy and efficiency of vulnerability identification in Python code.
Related Work: Identified the inefficiency and inaccuracy of conventional vulnerability detection methods and introduced our machine learning-based approach as a significant contribution.
Approach: Addressed the challenge of preprocessing and analyzing large datasets of raw source code and presented our comprehensive model as a solution.
Experimental Results: Validated the effectiveness of our proposed model in real-world scenarios and demonstrated its superior performance compared to existing methods.
Conclusion: Summarized the research problem and our contribution, highlighting the implications and future opportunities for our work.

We checked for all grammar problems and also fixed the quality problem with the images.

We believe that these revisions have significantly strengthened the manuscript and addressed the reviewers' concerns. We look forward to your positive response and are happy to provide any additional information or clarifications as needed.

Thank you for considering our revised manuscript for publication.

Author Response File: Author Response.pdf

Round 3

Reviewer 3 Report

Comments and Suggestions for Authors

The revised manuscript solved my questions.