SimHash-based binary code similarity comparison method

Bibliographic details
Main authors: YASUTSUNE, ZHANG JIANWEI, FU XIUFENG, JIA ZHANGTAO, FENG DACHENG, SHAO SA, KONG XIANGBING, LIU YUBO, TAO JINLONG, JIN YUCHUAN
Format: Patent
Language: Chinese; English
Description
Summary: The invention relates to a SimHash-based method for comparing the similarity of binary code and belongs to the field of code comparison. The method disassembles the binary code and preprocesses the resulting assembly code, normalizes the assembly code, computes SimHash values for it, builds a code feature relation library framework, and then quickly locates binary code based on text similarity. The stated advantage is efficiency: by adopting a SimHash-based text comparison method, the proposed scheme keeps binary code similarity comparison efficient and can improve its speed.
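The steps named in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not give the normalization rules, the feature choice, the hash function, the fingerprint width, or the library layout, so the register and constant placeholders, the use of one normalized instruction per feature, the truncated SHA-256 hash, the 64-bit width, the dictionary used as a "feature library", and the names `normalize`, `simhash`, `fingerprint`, `hamming`, and `locate` are all assumptions made for illustration. Disassembly itself is assumed to have been done already by an external tool.

```python
import hashlib
import re

# Assumed normalization rules (not taken from the patent): a few x86 register
# names and numeric constants are replaced by placeholder tokens.
_REGISTER = re.compile(r"\b(?:e|r)?(?:ax|bx|cx|dx|si|di|sp|bp)\b")
_IMMEDIATE = re.compile(r"\b(?:0x[0-9a-f]+|\d+)\b")


def normalize(line: str) -> str:
    """Lowercase one assembly line and abstract away registers and constants."""
    line = line.lower().strip()
    line = _IMMEDIATE.sub("IMM", line)
    return _REGISTER.sub("REG", line)


def simhash(features) -> int:
    """Compute a 64-bit SimHash over an iterable of string features."""
    weights = [0] * 64
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # keep the first 64 bits
        for i in range(64):
            weights[i] += 1 if (h >> i) & 1 else -1
    value = 0
    for i, w in enumerate(weights):
        if w > 0:  # bit is set when more features voted 1 than 0
            value |= 1 << i
    return value


def fingerprint(asm_lines) -> int:
    """SimHash of a function, using each normalized instruction as a feature."""
    return simhash(normalize(l) for l in asm_lines if l.strip())


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def locate(query_fp: int, library: dict, max_distance: int = 3):
    """Return (name, distance) pairs within max_distance, closest first."""
    hits = ((name, hamming(query_fp, fp)) for name, fp in library.items())
    return sorted((h for h in hits if h[1] <= max_distance), key=lambda h: h[1])


# Two hypothetical functions that differ only in registers and constants.
func_a = ["mov eax, 0x10", "add eax, ebx", "ret"]
func_b = ["mov ecx, 0x20", "add ecx, edx", "ret"]
library = {"func_a": fingerprint(func_a)}
matches = locate(fingerprint(func_b), library)
```

In this sketch, functions that normalize to the same instruction sequence receive identical fingerprints, and a lookup reduces to comparing 64-bit integers by Hamming distance. The linear scan in `locate` is the simplest possible choice; a real feature library would likely index fingerprints to avoid comparing against every entry.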