System for extracting domain topic using link analysis and searching for relevant features

Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic us...

Detailed description

Bibliographic details
Published in: Journal of ambient intelligence and humanized computing 2024-02, Vol.15 (2), p.1429-1441
Main authors: Hwang, Sang Won, Lee, Yong Seok, Nam, Young Kwang
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1441
container_issue 2
container_start_page 1429
container_title Journal of ambient intelligence and humanized computing
container_volume 15
creator Hwang, Sang Won
Lee, Yong Seok
Nam, Young Kwang
description Understanding the domain topic of software is important for maintenance and reuse. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information retrieval techniques, such as latent semantic indexing (LSI) and latent Dirichlet allocation (LDA), with research on LDA-based techniques being particularly active. However, because this research has used only unstructured information, such as identifiers or notes, without including structured information, such as method call information, the extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of each identifier’s occurrence in the source code, and a system is proposed that extracts association rules based on the co-occurrence of methods. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) for this research and, in a performance test, confirm high user satisfaction with the search results for their queries.
doi_str_mv 10.1007/s12652-018-1046-2
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2933288419</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2933288419</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</originalsourceid><addsrcrecordid>eNp1UE1LxDAQDaLgsu4P8BbwXM0kaZseZfELFjyoFy8hm07Wrt12TVJx_70pFT05l3nM-2B4hJwDuwTGyqsAvMh5xkBlwGSR8SMyA1WoLAeZH_9iUZ6SRQhblkZUAgBm5PXpECLuqOs9xa_ojY1Nt6F1vzNNR2O_bywdwnhqm-6dms60h9CEBGoa0Hj7NnKj22OLn6aL1KGJg8dwRk6caQMufvacvNzePC_vs9Xj3cPyepVZLoBnbp1Lg0o6IQTjTtWMl0IiV1ZKu66LCsGxUliH0haGJVooKLgDMMLCGsWcXEy5e99_DBii3vaDT48GzSshuFISqqSCSWV9H4JHp_e-2Rl_0MD02KKeWtSpRT22qHny8MkTkrbboP9L_t_0DaY9dQc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2933288419</pqid></control><display><type>article</type><title>System for extracting domain topic using link analysis and searching for relevant features</title><source>SpringerLink Journals - AutoHoldings</source><source>ProQuest Central</source><creator>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</creator><creatorcontrib>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</creatorcontrib><description>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. 
However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</description><identifier>ISSN: 1868-5137</identifier><identifier>EISSN: 1868-5145</identifier><identifier>DOI: 10.1007/s12652-018-1046-2</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Algorithms ; Artificial Intelligence ; Computational Intelligence ; Data mining ; Dirichlet problem ; Engineering ; Engineers ; Information retrieval ; Methods ; Names ; Natural language ; Original Research ; Performance indices ; Performance tests ; Queries ; Robotics and Automation ; Searching ; Software ; Source code ; Unstructured data ; User Interfaces and Human Computer Interaction ; User satisfaction</subject><ispartof>Journal of ambient intelligence and humanized computing, 2024-02, Vol.15 (2), p.1429-1441</ispartof><rights>Springer-Verlag GmbH Germany, part of Springer Nature 2018</rights><rights>Springer-Verlag GmbH Germany, part of Springer Nature 
2018.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</citedby><cites>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</cites><orcidid>0000-0001-8666-7479</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s12652-018-1046-2$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2933288419?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,776,780,21367,27901,27902,33721,41464,42533,43781,51294</link.rule.ids></links><search><creatorcontrib>Hwang, Sang Won</creatorcontrib><creatorcontrib>Lee, Yong Seok</creatorcontrib><creatorcontrib>Nam, Young Kwang</creatorcontrib><title>System for extracting domain topic using link analysis and searching for relevant features</title><title>Journal of ambient intelligence and humanized computing</title><addtitle>J Ambient Intell Human Comput</addtitle><description>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. 
This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Computational Intelligence</subject><subject>Data mining</subject><subject>Dirichlet problem</subject><subject>Engineering</subject><subject>Engineers</subject><subject>Information retrieval</subject><subject>Methods</subject><subject>Names</subject><subject>Natural language</subject><subject>Original Research</subject><subject>Performance indices</subject><subject>Performance tests</subject><subject>Queries</subject><subject>Robotics and Automation</subject><subject>Searching</subject><subject>Software</subject><subject>Source code</subject><subject>Unstructured data</subject><subject>User Interfaces and Human Computer Interaction</subject><subject>User 
satisfaction</subject><issn>1868-5137</issn><issn>1868-5145</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNp1UE1LxDAQDaLgsu4P8BbwXM0kaZseZfELFjyoFy8hm07Wrt12TVJx_70pFT05l3nM-2B4hJwDuwTGyqsAvMh5xkBlwGSR8SMyA1WoLAeZH_9iUZ6SRQhblkZUAgBm5PXpECLuqOs9xa_ojY1Nt6F1vzNNR2O_bywdwnhqm-6dms60h9CEBGoa0Hj7NnKj22OLn6aL1KGJg8dwRk6caQMufvacvNzePC_vs9Xj3cPyepVZLoBnbp1Lg0o6IQTjTtWMl0IiV1ZKu66LCsGxUliH0haGJVooKLgDMMLCGsWcXEy5e99_DBii3vaDT48GzSshuFISqqSCSWV9H4JHp_e-2Rl_0MD02KKeWtSpRT22qHny8MkTkrbboP9L_t_0DaY9dQc</recordid><startdate>20240201</startdate><enddate>20240201</enddate><creator>Hwang, Sang Won</creator><creator>Lee, Yong Seok</creator><creator>Nam, Young Kwang</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><orcidid>https://orcid.org/0000-0001-8666-7479</orcidid></search><sort><creationdate>20240201</creationdate><title>System for extracting domain topic using link analysis and searching for relevant features</title><author>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Computational Intelligence</topic><topic>Data mining</topic><topic>Dirichlet 
problem</topic><topic>Engineering</topic><topic>Engineers</topic><topic>Information retrieval</topic><topic>Methods</topic><topic>Names</topic><topic>Natural language</topic><topic>Original Research</topic><topic>Performance indices</topic><topic>Performance tests</topic><topic>Queries</topic><topic>Robotics and Automation</topic><topic>Searching</topic><topic>Software</topic><topic>Source code</topic><topic>Unstructured data</topic><topic>User Interfaces and Human Computer Interaction</topic><topic>User satisfaction</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hwang, Sang Won</creatorcontrib><creatorcontrib>Lee, Yong Seok</creatorcontrib><creatorcontrib>Nam, Young Kwang</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>Journal of ambient intelligence and humanized computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hwang, Sang 
Won</au><au>Lee, Yong Seok</au><au>Nam, Young Kwang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>System for extracting domain topic using link analysis and searching for relevant features</atitle><jtitle>Journal of ambient intelligence and humanized computing</jtitle><stitle>J Ambient Intell Human Comput</stitle><date>2024-02-01</date><risdate>2024</risdate><volume>15</volume><issue>2</issue><spage>1429</spage><epage>1441</epage><pages>1429-1441</pages><issn>1868-5137</issn><eissn>1868-5145</eissn><abstract>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. 
We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s12652-018-1046-2</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0001-8666-7479</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1868-5137
ispartof Journal of ambient intelligence and humanized computing, 2024-02, Vol.15 (2), p.1429-1441
issn 1868-5137
1868-5145
language eng
recordid cdi_proquest_journals_2933288419
source SpringerLink Journals - AutoHoldings; ProQuest Central
subjects Algorithms
Artificial Intelligence
Computational Intelligence
Data mining
Dirichlet problem
Engineering
Engineers
Information retrieval
Methods
Names
Natural language
Original Research
Performance indices
Performance tests
Queries
Robotics and Automation
Searching
Software
Source code
Unstructured data
User Interfaces and Human Computer Interaction
User satisfaction
title System for extracting domain topic using link analysis and searching for relevant features
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T03%3A26%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=System%20for%20extracting%20domain%20topic%20using%20link%20analysis%20and%20searching%20for%20relevant%20features&rft.jtitle=Journal%20of%20ambient%20intelligence%20and%20humanized%20computing&rft.au=Hwang,%20Sang%20Won&rft.date=2024-02-01&rft.volume=15&rft.issue=2&rft.spage=1429&rft.epage=1441&rft.pages=1429-1441&rft.issn=1868-5137&rft.eissn=1868-5145&rft_id=info:doi/10.1007/s12652-018-1046-2&rft_dat=%3Cproquest_cross%3E2933288419%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2933288419&rft_id=info:pmid/&rfr_iscdi=true