System for extracting domain topic using link analysis and searching for relevant features

Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic us...

Bibliographic Details
Published in:	Journal of ambient intelligence and humanized computing 2024-02, Vol.15 (2), p.1429-1441
Main Authors:	Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang
Format:	Article
Language:	eng
Subjects:	Algorithms ; Artificial Intelligence ; Computational Intelligence ; Data mining ; Dirichlet problem ; Engineering ; Engineers ; Information retrieval ; Methods ; Names ; Natural language ; Original Research ; Performance indices ; Performance tests ; Queries ; Robotics and Automation ; Searching ; Software ; Source code ; Unstructured data ; User Interfaces and Human Computer Interaction ; User satisfaction
Online Access:	Full text

container_end_page	1441
container_issue	2
container_start_page	1429
container_title	Journal of ambient intelligence and humanized computing
container_volume	15
creator	Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang
description	Understanding the domain topic of software is important for maintenance and reuse. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information retrieval techniques, such as latent semantic indexing (LSI) and latent Dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since this research has used only unstructured information, such as identifiers or notes, without including structured information, such as method call information, the extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on how frequently identifiers occur in the source code, and a system is proposed that extracts association rules based on method co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) for this research and confirm high user satisfaction with its search results in a performance test.
doi_str_mv	10.1007/s12652-018-1046-2
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2933288419</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2933288419</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</originalsourceid><addsrcrecordid>eNp1UE1LxDAQDaLgsu4P8BbwXM0kaZseZfELFjyoFy8hm07Wrt12TVJx_70pFT05l3nM-2B4hJwDuwTGyqsAvMh5xkBlwGSR8SMyA1WoLAeZH_9iUZ6SRQhblkZUAgBm5PXpECLuqOs9xa_ojY1Nt6F1vzNNR2O_bywdwnhqm-6dms60h9CEBGoa0Hj7NnKj22OLn6aL1KGJg8dwRk6caQMufvacvNzePC_vs9Xj3cPyepVZLoBnbp1Lg0o6IQTjTtWMl0IiV1ZKu66LCsGxUliH0haGJVooKLgDMMLCGsWcXEy5e99_DBii3vaDT48GzSshuFISqqSCSWV9H4JHp_e-2Rl_0MD02KKeWtSpRT22qHny8MkTkrbboP9L_t_0DaY9dQc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2933288419</pqid></control><display><type>article</type><title>System for extracting domain topic using link analysis and searching for relevant features</title><source>SpringerLink Journals - AutoHoldings</source><source>ProQuest Central</source><creator>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</creator><creatorcontrib>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</creatorcontrib><description>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. 
However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</description><identifier>ISSN: 1868-5137</identifier><identifier>EISSN: 1868-5145</identifier><identifier>DOI: 10.1007/s12652-018-1046-2</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Algorithms ; Artificial Intelligence ; Computational Intelligence ; Data mining ; Dirichlet problem ; Engineering ; Engineers ; Information retrieval ; Methods ; Names ; Natural language ; Original Research ; Performance indices ; Performance tests ; Queries ; Robotics and Automation ; Searching ; Software ; Source code ; Unstructured data ; User Interfaces and Human Computer Interaction ; User satisfaction</subject><ispartof>Journal of ambient intelligence and humanized computing, 2024-02, Vol.15 (2), p.1429-1441</ispartof><rights>Springer-Verlag GmbH Germany, part of Springer Nature 2018</rights><rights>Springer-Verlag GmbH Germany, part of Springer Nature 
2018.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</citedby><cites>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</cites><orcidid>0000-0001-8666-7479</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s12652-018-1046-2$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2933288419?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,776,780,21367,27901,27902,33721,41464,42533,43781,51294</link.rule.ids></links><search><creatorcontrib>Hwang, Sang Won</creatorcontrib><creatorcontrib>Lee, Yong Seok</creatorcontrib><creatorcontrib>Nam, Young Kwang</creatorcontrib><title>System for extracting domain topic using link analysis and searching for relevant features</title><title>Journal of ambient intelligence and humanized computing</title><addtitle>J Ambient Intell Human Comput</addtitle><description>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. 
This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Computational Intelligence</subject><subject>Data mining</subject><subject>Dirichlet problem</subject><subject>Engineering</subject><subject>Engineers</subject><subject>Information retrieval</subject><subject>Methods</subject><subject>Names</subject><subject>Natural language</subject><subject>Original Research</subject><subject>Performance indices</subject><subject>Performance tests</subject><subject>Queries</subject><subject>Robotics and Automation</subject><subject>Searching</subject><subject>Software</subject><subject>Source code</subject><subject>Unstructured data</subject><subject>User Interfaces and Human Computer Interaction</subject><subject>User 
satisfaction</subject><issn>1868-5137</issn><issn>1868-5145</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNp1UE1LxDAQDaLgsu4P8BbwXM0kaZseZfELFjyoFy8hm07Wrt12TVJx_70pFT05l3nM-2B4hJwDuwTGyqsAvMh5xkBlwGSR8SMyA1WoLAeZH_9iUZ6SRQhblkZUAgBm5PXpECLuqOs9xa_ojY1Nt6F1vzNNR2O_bywdwnhqm-6dms60h9CEBGoa0Hj7NnKj22OLn6aL1KGJg8dwRk6caQMufvacvNzePC_vs9Xj3cPyepVZLoBnbp1Lg0o6IQTjTtWMl0IiV1ZKu66LCsGxUliH0haGJVooKLgDMMLCGsWcXEy5e99_DBii3vaDT48GzSshuFISqqSCSWV9H4JHp_e-2Rl_0MD02KKeWtSpRT22qHny8MkTkrbboP9L_t_0DaY9dQc</recordid><startdate>20240201</startdate><enddate>20240201</enddate><creator>Hwang, Sang Won</creator><creator>Lee, Yong Seok</creator><creator>Nam, Young Kwang</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><orcidid>https://orcid.org/0000-0001-8666-7479</orcidid></search><sort><creationdate>20240201</creationdate><title>System for extracting domain topic using link analysis and searching for relevant features</title><author>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Computational Intelligence</topic><topic>Data mining</topic><topic>Dirichlet 
problem</topic><topic>Engineering</topic><topic>Engineers</topic><topic>Information retrieval</topic><topic>Methods</topic><topic>Names</topic><topic>Natural language</topic><topic>Original Research</topic><topic>Performance indices</topic><topic>Performance tests</topic><topic>Queries</topic><topic>Robotics and Automation</topic><topic>Searching</topic><topic>Software</topic><topic>Source code</topic><topic>Unstructured data</topic><topic>User Interfaces and Human Computer Interaction</topic><topic>User satisfaction</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hwang, Sang Won</creatorcontrib><creatorcontrib>Lee, Yong Seok</creatorcontrib><creatorcontrib>Nam, Young Kwang</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>Journal of ambient intelligence and humanized computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hwang, Sang 
Won</au><au>Lee, Yong Seok</au><au>Nam, Young Kwang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>System for extracting domain topic using link analysis and searching for relevant features</atitle><jtitle>Journal of ambient intelligence and humanized computing</jtitle><stitle>J Ambient Intell Human Comput</stitle><date>2024-02-01</date><risdate>2024</risdate><volume>15</volume><issue>2</issue><spage>1429</spage><epage>1441</epage><pages>1429-1441</pages><issn>1868-5137</issn><eissn>1868-5145</eissn><abstract>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. 
We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s12652-018-1046-2</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0001-8666-7479</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 1868-5137
ispartof	Journal of ambient intelligence and humanized computing, 2024-02, Vol.15 (2), p.1429-1441
issn	1868-5137 1868-5145
language	eng
recordid	cdi_proquest_journals_2933288419
source	SpringerLink Journals - AutoHoldings; ProQuest Central
subjects	Algorithms ; Artificial Intelligence ; Computational Intelligence ; Data mining ; Dirichlet problem ; Engineering ; Engineers ; Information retrieval ; Methods ; Names ; Natural language ; Original Research ; Performance indices ; Performance tests ; Queries ; Robotics and Automation ; Searching ; Software ; Source code ; Unstructured data ; User Interfaces and Human Computer Interaction ; User satisfaction
title	System for extracting domain topic using link analysis and searching for relevant features
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T03%3A26%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=System%20for%20extracting%20domain%20topic%20using%20link%20analysis%20and%20searching%20for%20relevant%20features&rft.jtitle=Journal%20of%20ambient%20intelligence%20and%20humanized%20computing&rft.au=Hwang,%20Sang%20Won&rft.date=2024-02-01&rft.volume=15&rft.issue=2&rft.spage=1429&rft.epage=1441&rft.pages=1429-1441&rft.issn=1868-5137&rft.eissn=1868-5145&rft_id=info:doi/10.1007/s12652-018-1046-2&rft_dat=%3Cproquest_cross%3E2933288419%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2933288419&rft_id=info:pmid/&rfr_iscdi=true
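
The abstract in this record mentions two steps that a short example can make concrete: building an index from how often identifiers occur in the source code, and extracting association rules from method co-occurrence. The Python sketch below is only an illustration under stated assumptions. The record does not describe how TEXAS2 scores terms or which thresholds it uses, so the function names (`split_identifier`, `build_index`, `association_rules`), the raw-count scoring, and the `min_support` and `min_confidence` defaults are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def build_index(methods):
    """Map each term to {method name: frequency of that term}.

    `methods` maps a method name to the identifiers found in its body.
    Raw counts stand in for a score (an assumption, not the paper's formula).
    """
    index = {}
    for method, identifiers in methods.items():
        counts = Counter(
            term for ident in identifiers for term in split_identifier(ident)
        )
        for term, freq in counts.items():
            index.setdefault(term, {})[method] = freq
    return index


def association_rules(call_sets, min_support=2, min_confidence=0.6):
    """Derive rules lhs -> rhs from methods called together by one caller.

    `call_sets` holds one list of callee names per calling method.
    Returns tuples (lhs, rhs, support, confidence).
    """
    single = Counter()
    pair = Counter()
    for calls in call_sets:
        unique = sorted(set(calls))  # count each callee once per caller
        single.update(unique)
        pair.update(combinations(unique, 2))
    rules = []
    for (a, b), support in pair.items():
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules
```

A search front end could look up query terms in the index returned by `build_index` and then use the rules to suggest methods that tend to be called alongside each hit. How such signals are combined with LDA topics in the actual system is not stated in this record.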