Code clone detection method and system based on abstract syntax tree optimization and multi-representation
Main authors: | , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Abstract: | The invention discloses a code clone detection method and system based on abstract syntax tree optimization and multi-representation. The method comprises the following steps: compiling a code text to obtain the corresponding abstract syntax tree; optimizing the abstract syntax tree by removing compiler-generated nodes and compilation-error recovery nodes, removing declaration nodes and constant nodes, refining expression nodes, and converting selection structures and loop structures into their respective unified sub-tree structures; traversing the optimized abstract syntax tree to obtain a pre-order sequence and a post-order sequence; inputting the two sequences into a Transformer network, which outputs a feature fingerprint for the code text; obtaining a feature fingerprint for each of a plurality of code texts in this way; and, if the cosine similarity of any two feature fingerprints is greater than a first set threshold, determining that the two code texts co |
---|
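The pipeline described in the abstract (parse, prune, traverse in two orders, fingerprint, compare by cosine similarity) can be illustrated with a minimal Python sketch. It is not the patented method: it uses Python's standard `ast` module in place of a compiler front end, applies only one of the listed optimizations (dropping constant nodes), and replaces the Transformer network with a simple bigram count vector over the two traversal sequences. The function names and the 0.9 threshold are hypothetical and chosen only for illustration.

```python
import ast
import math
from collections import Counter

# Assumption: only constant nodes are pruned; the abstract lists further
# optimizations (declaration removal, unified loop/selection sub-trees).
SKIPPED = (ast.Constant,)


def preorder(node):
    """Node-type names in pre-order (parent before children)."""
    if isinstance(node, SKIPPED):
        return []
    seq = [type(node).__name__]
    for child in ast.iter_child_nodes(node):
        seq.extend(preorder(child))
    return seq


def postorder(node):
    """Node-type names in post-order (children before parent)."""
    if isinstance(node, SKIPPED):
        return []
    seq = []
    for child in ast.iter_child_nodes(node):
        seq.extend(postorder(child))
    seq.append(type(node).__name__)
    return seq


def fingerprint(source):
    """Stand-in for the Transformer: bigram counts over both sequences."""
    tree = ast.parse(source)
    counts = Counter()
    for tag, seq in (("pre", preorder(tree)), ("post", postorder(tree))):
        counts.update((tag, a, b) for a, b in zip(seq, seq[1:]))
    return counts


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def is_clone(source_a, source_b, threshold=0.9):
    """Flag a clone pair when similarity exceeds the (hypothetical) threshold."""
    return cosine(fingerprint(source_a), fingerprint(source_b)) > threshold
```

Because identifiers are never part of the sequences and constants are pruned, two functions that differ only in names and literal values produce the same fingerprint, while structurally unrelated code shares few or no bigrams.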