Next Article in Journal
Experimental and Numerical Study on the Strain Behavior of Buried Pipelines Subjected to an Impact Load
Previous Article in Journal
Preparation of Multicycle GO/TiO2 Composite Photocatalyst and Study on Degradation of Methylene Blue Synthetic Wastewater
Previous Article in Special Issue
VPNFilter Malware Analysis on Cyber Threat in Smart Home Network
Article Menu

Export Article

Open AccessArticle

Semantic-Based Representation Binary Clone Detection for Cross-Architectures in the Internet of Things

College of Computer, National University of Defense Technology, Changsha 410073, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2019, 9(16), 3283; https://doi.org/10.3390/app9163283
Received: 8 July 2019 / Revised: 7 August 2019 / Accepted: 8 August 2019 / Published: 10 August 2019
  |  
PDF [2390 KB, uploaded 10 August 2019]
  |  

Abstract

Code reuse is widespread in software development as well as internet of things (IoT) devices. However, code reuse introduces many problems, e.g., software plagiarism and known vulnerabilities. Solving these problems requires extensive manual reverse analysis. Fortunately, binary clone detection can help analysts mitigate manual work by matching reusable code and known parts. However, many binary clone detection methods are not robust to various compiler optimization options and different architectures. While some clone detection methods can be applied across different architectures, they rely on manual features based on human prior knowledge to generate feature vectors for assembly functions and fail to consider the internal associations between features from a semantic perspective. To address this problem, we propose and implement a prototype GeneDiff, a semantic-based representation binary clone detection approach for cross-architectures. GeneDiff utilizes a representation model based on natural language processing (NLP) to generate high-dimensional numeric vectors for each function based on the Valgrind intermediate representation (VEX) representation. This is the first work that translates assembly instructions into an intermediate representation and uses a semantic representation model to implement clone detection for cross-architectures. GeneDiff is robust to various compiler optimization options and different architectures. Compared to approaches using symbolic execution, GeneDiff is significantly more efficient and accurate. The area under the curve (AUC) of the receiver operating characteristic (ROC) of GeneDiff reaches 92.35%, which is considerably higher than the approaches that use symbolic execution. Extensive experiments indicate that GeneDiff can detect similarity with high accuracy even when the code has been compiled with different optimization options and targeted to different architectures. We also use real-world IoT firmware across different architectures as targets, therein proving the practicality of GeneDiff in being able to detect known vulnerabilities. View Full-Text
Keywords: binary clone detection; Semantic representation; cross-architectures; IoT devices; real-world vulnerabilities binary clone detection; Semantic representation; cross-architectures; IoT devices; real-world vulnerabilities
Figures

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
SciFeed

Share & Cite This Article

MDPI and ACS Style

Luo, Z.; Wang, B.; Tang, Y.; Xie, W. Semantic-Based Representation Binary Clone Detection for Cross-Architectures in the Internet of Things. Appl. Sci. 2019, 9, 3283.

Show more citation formats Show less citations formats

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers. See further details here.

Related Articles

Article Metrics

Article Access Statistics

1

Comments

[Return to top]
Appl. Sci. EISSN 2076-3417 Published by MDPI AG, Basel, Switzerland RSS E-Mail Table of Contents Alert
Back to Top