Collecting Vulnerable Source Code from Open-Source Repositories for Dataset Generation

Grupo de Robótica, Universidad de León. Avenida Jesuitas, s/n, 24007 León, Spain
* Author to whom correspondence should be addressed.
Appl. Sci. 2020, 10(4), 1270; https://doi.org/10.3390/app10041270
Received: 29 December 2019 / Revised: 3 February 2020 / Accepted: 7 February 2020 / Published: 13 February 2020
Various Machine Learning techniques for detecting software vulnerabilities have emerged in scientific and industrial settings. The actors in these settings aim to develop algorithms that predict security threats without human intervention. However, these algorithms depend on data-driven engines that process large amounts of data, known as datasets. This paper introduces the SonarCloud Vulnerable Code Prospector for C (SVCP4C). The tool collects vulnerable source code from open-source repositories linked to SonarCloud, an online tool that performs static analysis and tags potentially vulnerable code. It provides a set of tagged files suitable for extracting features and creating training datasets for Machine Learning algorithms. This study presents a descriptive analysis of these files and gives an overview of the current status of C vulnerabilities, specifically buffer overflows, in the reviewed public repositories.
Keywords: vulnerability; sonarcloud; bot; source code; repository; buffer overflow

MDPI and ACS Style

Raducu, R.; Esteban, G.; Rodríguez Lera, F.J.; Fernández, C. Collecting Vulnerable Source Code from Open-Source Repositories for Dataset Generation. Appl. Sci. 2020, 10, 1270.

