System for extracting domain topic using link analysis and searching for relevant features

Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic us...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of ambient intelligence and humanized computing 2024-02, Vol.15 (2), p.1429-1441
Hauptverfasser: Hwang, Sang Won, Lee, Yong Seok, Nam, Young Kwang
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.
ISSN:1868-5137
1868-5145
DOI:10.1007/s12652-018-1046-2