System for extracting domain topic using link analysis and searching for relevant features
Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic us...
Published in: | Journal of ambient intelligence and humanized computing 2024-02, Vol.15 (2), p.1429-1441 |
---|---|
Main authors: | Hwang, Sang Won; Lee, Yong Seok; Nam, Young Kwang |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1441 |
---|---|
container_issue | 2 |
container_start_page | 1429 |
container_title | Journal of ambient intelligence and humanized computing |
container_volume | 15 |
creator | Hwang, Sang Won; Lee, Yong Seok; Nam, Young Kwang |
description | Understanding the domain topic of software is important for maintenance and reuse. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research has sought to solve this problem by extracting the domain topic using various information retrieval techniques, such as latent semantic indexing (LSI) and latent Dirichlet allocation (LDA), with research on LDA-based techniques being particularly active. However, because this research has used only unstructured information, such as identifiers or comments, without including structured information, such as method call information, the extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of each identifier’s occurrence in the source code, and a system is proposed that extracts association rules based on the co-occurrence of methods. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) for this research and confirm high user satisfaction with its search results in a performance test. |
doi_str_mv | 10.1007/s12652-018-1046-2 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2933288419</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2933288419</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</originalsourceid><addsrcrecordid>eNp1UE1LxDAQDaLgsu4P8BbwXM0kaZseZfELFjyoFy8hm07Wrt12TVJx_70pFT05l3nM-2B4hJwDuwTGyqsAvMh5xkBlwGSR8SMyA1WoLAeZH_9iUZ6SRQhblkZUAgBm5PXpECLuqOs9xa_ojY1Nt6F1vzNNR2O_bywdwnhqm-6dms60h9CEBGoa0Hj7NnKj22OLn6aL1KGJg8dwRk6caQMufvacvNzePC_vs9Xj3cPyepVZLoBnbp1Lg0o6IQTjTtWMl0IiV1ZKu66LCsGxUliH0haGJVooKLgDMMLCGsWcXEy5e99_DBii3vaDT48GzSshuFISqqSCSWV9H4JHp_e-2Rl_0MD02KKeWtSpRT22qHny8MkTkrbboP9L_t_0DaY9dQc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2933288419</pqid></control><display><type>article</type><title>System for extracting domain topic using link analysis and searching for relevant features</title><source>SpringerLink Journals - AutoHoldings</source><source>ProQuest Central</source><creator>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</creator><creatorcontrib>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</creatorcontrib><description>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. 
However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</description><identifier>ISSN: 1868-5137</identifier><identifier>EISSN: 1868-5145</identifier><identifier>DOI: 10.1007/s12652-018-1046-2</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Algorithms ; Artificial Intelligence ; Computational Intelligence ; Data mining ; Dirichlet problem ; Engineering ; Engineers ; Information retrieval ; Methods ; Names ; Natural language ; Original Research ; Performance indices ; Performance tests ; Queries ; Robotics and Automation ; Searching ; Software ; Source code ; Unstructured data ; User Interfaces and Human Computer Interaction ; User satisfaction</subject><ispartof>Journal of ambient intelligence and humanized computing, 2024-02, Vol.15 (2), p.1429-1441</ispartof><rights>Springer-Verlag GmbH Germany, part of Springer Nature 2018</rights><rights>Springer-Verlag GmbH Germany, part of Springer Nature 
2018.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</citedby><cites>FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</cites><orcidid>0000-0001-8666-7479</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s12652-018-1046-2$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2933288419?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,776,780,21367,27901,27902,33721,41464,42533,43781,51294</link.rule.ids></links><search><creatorcontrib>Hwang, Sang Won</creatorcontrib><creatorcontrib>Lee, Yong Seok</creatorcontrib><creatorcontrib>Nam, Young Kwang</creatorcontrib><title>System for extracting domain topic using link analysis and searching for relevant features</title><title>Journal of ambient intelligence and humanized computing</title><addtitle>J Ambient Intell Human Comput</addtitle><description>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. 
This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Computational Intelligence</subject><subject>Data mining</subject><subject>Dirichlet problem</subject><subject>Engineering</subject><subject>Engineers</subject><subject>Information retrieval</subject><subject>Methods</subject><subject>Names</subject><subject>Natural language</subject><subject>Original Research</subject><subject>Performance indices</subject><subject>Performance tests</subject><subject>Queries</subject><subject>Robotics and Automation</subject><subject>Searching</subject><subject>Software</subject><subject>Source code</subject><subject>Unstructured data</subject><subject>User Interfaces and Human Computer Interaction</subject><subject>User 
satisfaction</subject><issn>1868-5137</issn><issn>1868-5145</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNp1UE1LxDAQDaLgsu4P8BbwXM0kaZseZfELFjyoFy8hm07Wrt12TVJx_70pFT05l3nM-2B4hJwDuwTGyqsAvMh5xkBlwGSR8SMyA1WoLAeZH_9iUZ6SRQhblkZUAgBm5PXpECLuqOs9xa_ojY1Nt6F1vzNNR2O_bywdwnhqm-6dms60h9CEBGoa0Hj7NnKj22OLn6aL1KGJg8dwRk6caQMufvacvNzePC_vs9Xj3cPyepVZLoBnbp1Lg0o6IQTjTtWMl0IiV1ZKu66LCsGxUliH0haGJVooKLgDMMLCGsWcXEy5e99_DBii3vaDT48GzSshuFISqqSCSWV9H4JHp_e-2Rl_0MD02KKeWtSpRT22qHny8MkTkrbboP9L_t_0DaY9dQc</recordid><startdate>20240201</startdate><enddate>20240201</enddate><creator>Hwang, Sang Won</creator><creator>Lee, Yong Seok</creator><creator>Nam, Young Kwang</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><orcidid>https://orcid.org/0000-0001-8666-7479</orcidid></search><sort><creationdate>20240201</creationdate><title>System for extracting domain topic using link analysis and searching for relevant features</title><author>Hwang, Sang Won ; Lee, Yong Seok ; Nam, Young Kwang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2312-fb54ae84f33302f8d02734e28c44cbd69e1f073cfe4c6a0d0238162f11a3c1be3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Computational Intelligence</topic><topic>Data mining</topic><topic>Dirichlet 
problem</topic><topic>Engineering</topic><topic>Engineers</topic><topic>Information retrieval</topic><topic>Methods</topic><topic>Names</topic><topic>Natural language</topic><topic>Original Research</topic><topic>Performance indices</topic><topic>Performance tests</topic><topic>Queries</topic><topic>Robotics and Automation</topic><topic>Searching</topic><topic>Software</topic><topic>Source code</topic><topic>Unstructured data</topic><topic>User Interfaces and Human Computer Interaction</topic><topic>User satisfaction</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hwang, Sang Won</creatorcontrib><creatorcontrib>Lee, Yong Seok</creatorcontrib><creatorcontrib>Nam, Young Kwang</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>Journal of ambient intelligence and humanized computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hwang, Sang 
Won</au><au>Lee, Yong Seok</au><au>Nam, Young Kwang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>System for extracting domain topic using link analysis and searching for relevant features</atitle><jtitle>Journal of ambient intelligence and humanized computing</jtitle><stitle>J Ambient Intell Human Comput</stitle><date>2024-02-01</date><risdate>2024</risdate><volume>15</volume><issue>2</issue><spage>1429</spage><epage>1441</epage><pages>1429-1441</pages><issn>1868-5137</issn><eissn>1868-5145</eissn><abstract>Understanding the domain topic of software in terms of maintenance and reuse is important. However, the continual development of software and changes in its size make it difficult for engineers to understand. Recent research studies have sought to solve this problem by extracting the domain topic using various information searching techniques, such as latent semantic indexing (LSI) and latent dirichlet allocation (LDA), with the research on LDA-based techniques being particularly active. However, since the research has used only unstructured information, such as identifiers or notes, without including structured information, such as a method of calling information, it has caused problems in which extracted topics differ according to the program’s characteristics. This paper proposes a method of generating documents and extracting topics using both structured and unstructured information. In addition, indexes are generated based on the frequency of the identifier’s occurrence in the source code, and a system is proposed that extracts an association rule based on the method’s co-occurrence. Moreover, this paper suggests an information retrieval system that can provide highly reliable search results for user queries by combining domain topics, indexes with scores, and association rule information. 
We also develop the Topic EXtract And Search System (TEXAS2) system for this research and confirm high user satisfaction with the search results to their queries in a performance test.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s12652-018-1046-2</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0001-8666-7479</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1868-5137 |
ispartof | Journal of ambient intelligence and humanized computing, 2024-02, Vol.15 (2), p.1429-1441 |
issn | 1868-5137; 1868-5145 |
language | eng |
recordid | cdi_proquest_journals_2933288419 |
source | SpringerLink Journals - AutoHoldings; ProQuest Central |
subjects | Algorithms; Artificial Intelligence; Computational Intelligence; Data mining; Dirichlet problem; Engineering; Engineers; Information retrieval; Methods; Names; Natural language; Original Research; Performance indices; Performance tests; Queries; Robotics and Automation; Searching; Software; Source code; Unstructured data; User Interfaces and Human Computer Interaction; User satisfaction |
title | System for extracting domain topic using link analysis and searching for relevant features |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T03%3A26%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=System%20for%20extracting%20domain%20topic%20using%20link%20analysis%20and%20searching%20for%20relevant%20features&rft.jtitle=Journal%20of%20ambient%20intelligence%20and%20humanized%20computing&rft.au=Hwang,%20Sang%20Won&rft.date=2024-02-01&rft.volume=15&rft.issue=2&rft.spage=1429&rft.epage=1441&rft.pages=1429-1441&rft.issn=1868-5137&rft.eissn=1868-5145&rft_id=info:doi/10.1007/s12652-018-1046-2&rft_dat=%3Cproquest_cross%3E2933288419%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2933288419&rft_id=info:pmid/&rfr_iscdi=true |
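
The description field of this record mentions two building blocks: indexes derived from how often identifiers occur in source code, and association rules derived from the co-occurrence of methods. The record gives no algorithmic detail, so the following is only a minimal sketch of those two ideas under stated assumptions, not the TEXAS2 implementation. The function names, the regular expression used to find identifiers, the restriction to pairwise rules, and the support and confidence thresholds are all hypothetical choices made for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def identifier_frequencies(source: str) -> Counter:
    """Count identifier-like tokens in a source string.

    Assumption: any token matching [A-Za-z_][A-Za-z0-9_]* counts as an
    identifier, so language keywords are counted too.
    """
    return Counter(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source))


def cooccurrence_rules(call_sets, min_support=0.5, min_confidence=0.6):
    """Derive pairwise rules lhs -> rhs from groups of co-occurring methods.

    call_sets: one list of method names per unit (for example, per caller).
    Returns sorted tuples (lhs, rhs, support, confidence).
    Thresholds are illustrative defaults, not values from the paper.
    """
    n = len(call_sets)
    single = Counter()  # number of units in which each method appears
    pair = Counter()    # number of units in which each method pair appears
    for methods in call_sets:
        unique = sorted(set(methods))
        single.update(unique)
        pair.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        # Evaluate the rule in both directions.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    print(identifier_frequencies("total = total + count").most_common(1))
    calls = [["open", "read", "close"], ["open", "close"],
             ["open", "write", "close"], ["parse"]]
    for rule in cooccurrence_rules(calls):
        print(rule)
```

In this toy input, `open` and `close` appear together in three of four units, so they are the only pair that passes the support threshold. A retrieval system of the kind the abstract describes could combine such rules with frequency-based index scores and topic information when ranking results, but how the authors weight these signals is not stated in this record.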