Ask Me Any Type: Type Inference Plugin for Partial Code on the Web and in the Integrated Development Environment

Bibliographic details
Published in: Wuhan University Journal of Natural Sciences, 2024-08, Vol. 29 (4), pp. 349-356
Main authors: CHENG, Yu; HUANG, Guanming; WU, Yishun; ZHAO, Zijie; HE, Zhenhao; LU, Jiaxing
Format: Article
Language: English
Description
Abstract: Inferring the fully qualified names (FQNs) of undeclared receiving objects and non-fully-qualified type names (non-FQNs) in partial code is critical for effectively searching, understanding, and reusing partial code. Existing type inference tools, such as COSTER and SNR, rely on a symbolic knowledge base and adopt a dictionary-lookup strategy to map the simple names of undeclared receiving objects and non-FQNs to FQNs. However, building a symbolic knowledge base requires parsing compilable code files, which limits the collection of APIs and code contexts and results in out-of-vocabulary (OOV) failures. To overcome the limitations of a symbolic knowledge base for FQN inference, we implemented Ask Me Any Type (AMAT), a type inference plugin embedded in web browsers and integrated development environments (IDEs). Instead of the dictionary-lookup strategy, AMAT uses a cloze-style fill-in-the-blank strategy for type inference. By treating code as text, AMAT leverages a fine-tuned large language model (LLM) as a neural knowledge base, thereby avoiding the need for code compilation. Experimental results show that AMAT outperforms state-of-the-art tools such as COSTER and SNR. In practice, developers can directly reuse partial code by inferring the FQNs of unresolved type names in real time.
ISSN: 1007-1202, 1993-4998
DOI: 10.1051/wujns/2024294349
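
The abstract contrasts a dictionary-lookup strategy with a cloze-style fill-in-the-blank strategy. The following is a minimal sketch of that contrast only, not AMAT's actual implementation: the knowledge-base contents, the `<mask>` token, the prompt layout, and the function names `lookup_fqn` and `build_cloze_prompt` are all assumptions made for illustration, and no language model is called.

```python
import re

# Hypothetical symbolic knowledge base (simple name -> FQN). A real tool
# would build this by parsing compilable code; the entries are illustrative.
KNOWLEDGE_BASE = {
    "List": "java.util.List",
    "File": "java.io.File",
}

# Assumed mask token; the token a given model expects may differ.
MASK = "<mask>"


def lookup_fqn(simple_name):
    """Dictionary-lookup strategy: return the stored FQN, or None when the
    simple name is out of vocabulary (an OOV failure)."""
    return KNOWLEDGE_BASE.get(simple_name)


def build_cloze_prompt(code, simple_name):
    """Cloze-style strategy: treat the code as plain text and put a blank in
    front of the first standalone occurrence of the simple name, so that a
    language model could be asked to fill in the package prefix."""
    pattern = r"\b" + re.escape(simple_name) + r"\b"
    return re.sub(pattern, lambda m: f"{MASK}.{m.group(0)}", code, count=1)


if __name__ == "__main__":
    snippet = "JsonObject obj = parser.parse(text);"
    print(lookup_fqn("JsonObject"))               # name is not in the dictionary
    print(build_cloze_prompt(snippet, "JsonObject"))
```

The lookup fails for any name missing from the dictionary, whereas the prompt can be built for arbitrary partial code without compiling it; the quality of the filled-in answer then depends entirely on the model, which this sketch leaves out.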