Topology-aware hashing for effective control flow graph similarity analysis

4Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Control Flow Graph (CFG) similarity analysis is an essential technique for a variety of security analysis tasks, including malware detection and malware clustering. Even though various algorithms have been developed, existing CFG similarity analysis methods still suffer from limited efficiency, accuracy, and usability. In this paper, we propose a novel fuzzy hashing scheme called topology-aware hashing (TAH) for effective and efficient CFG similarity analysis. Given the CFGs constructed from program binaries, we extract blended n-gram graphical features of the CFGs, encode the graphical features into numeric vectors (called graph signatures), and then measure the graph similarity by comparing the graph signatures. We further employ a fuzzy hashing technique to convert the numeric graph signatures into smaller fixed-size fuzzy hash signatures for efficient similarity calculation. Our comprehensive evaluation demonstrates that TAH is more effective and efficient compared to existing CFG comparison techniques. To demonstrate the applicability of TAH to real-world security analysis tasks, we develop a binary similarity analysis tool based on TAH, and show that it outperforms existing similarity analysis tools while conducting malware clustering.

Cite

CITATION STYLE

APA

Li, Y., Jang, J., & Ou, X. (2019). Topology-aware hashing for effective control flow graph similarity analysis. In Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (Vol. 304 LNICST, pp. 278–298). Springer. https://doi.org/10.1007/978-3-030-37228-6_14

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free