Analyzing Techniques for Duplicate Question Detection on Q&A Websites for Game Developers
Game development is currently the largest industry in the entertainment segment and has a high demand for skilled game developers who can produce high-quality games. To satiate this demand, game developers need resources that can provide them with the knowledge they need to learn and improve their...
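Published in: | Empirical software engineering : an international journal 2023-01, Vol.28 (1), p.17, Article 17 |
---|---|
Main authors: | Kamienski, Arthur ; Hindle, Abram ; Bezemer, Cor-Paul |
Format: | Article |
Language: | eng |
Subjects: | Compilers ; Computer & video games ; Computer Science ; Games ; Interpreters ; Performance enhancement ; Programming Languages ; Questions ; Reproduction (copying) ; Software engineering ; Software Engineering/Programming and Operating Systems ; Websites |
Online access: | Full text |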
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | 17 |
container_title | Empirical software engineering : an international journal |
container_volume | 28 |
creator | Kamienski, Arthur ; Hindle, Abram ; Bezemer, Cor-Paul |
description | Game development is currently the largest industry in the entertainment segment and has a high demand for skilled game developers who can produce high-quality games. To satiate this demand, game developers need resources that can provide them with the knowledge they need to learn and improve their skills. Question and Answer (Q&A) websites are one such resource, providing a valuable source of knowledge about game development practices. However, the presence of duplicate questions on Q&A websites hinders their ability to effectively provide information for their users. While several researchers created and analyzed techniques for duplicate question detection on websites such as Stack Overflow, so far no studies have explored how well those techniques work on Q&A websites for game development. With that in mind, in this paper we analyze how we can use pre-trained and unsupervised techniques to detect duplicate questions on Q&A websites focused on game development using data extracted from the Game Development Stack Exchange and Stack Overflow. We also explore how we can leverage a small set of labelled data to improve the performance of those techniques. The pre-trained technique based on MPNet achieved the best results in identifying duplicate questions about game development, and we achieved better performance when combining multiple unsupervised techniques into a single supervised model. Furthermore, the supervised models could identify duplicate questions on websites different from those they were trained on with little to no decrease in performance. Our results lay the groundwork for building better duplicate question detection systems on Q&A websites for game developers and ultimately providing game developers with a more effective Q&A community. |
doi_str_mv | 10.1007/s10664-022-10256-w |
format | Article |
publisher | New York: Springer US |
rights | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. |
orcid | https://orcid.org/0000-0003-3851-8262 |
fulltext | fulltext |
identifier | ISSN: 1382-3256 |
ispartof | Empirical software engineering : an international journal, 2023-01, Vol.28 (1), p.17, Article 17 |
issn | 1382-3256 ; 1573-7616 |
language | eng |
recordid | cdi_proquest_journals_2748040365 |
source | SpringerLink Journals - AutoHoldings |
subjects | Compilers ; Computer & video games ; Computer Science ; Games ; Interpreters ; Performance enhancement ; Programming Languages ; Questions ; Reproduction (copying) ; Software engineering ; Software Engineering/Programming and Operating Systems ; Websites |
title | Analyzing Techniques for Duplicate Question Detection on Q&A Websites for Game Developers |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T14%3A39%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Analyzing%20Techniques%20for%20Duplicate%20Question%20Detection%20on%20Q&A%20Websites%20for%20Game%20Developers&rft.jtitle=Empirical%20software%20engineering%20:%20an%20international%20journal&rft.au=Kamienski,%20Arthur&rft.date=2023-01-01&rft.volume=28&rft.issue=1&rft.spage=17&rft.pages=17-&rft.artnum=17&rft.issn=1382-3256&rft.eissn=1573-7616&rft_id=info:doi/10.1007/s10664-022-10256-w&rft_dat=%3Cproquest_cross%3E2748040365%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2748040365&rft_id=info:pmid/&rfr_iscdi=true |
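
The abstract describes ranking existing questions as candidate duplicates of a new one using unsupervised techniques. As a rough illustration only, the sketch below scores candidates by cosine similarity over simple word counts. This is a stand-in chosen for brevity, not the MPNet-based technique or any specific method from the paper; the function names and example questions are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty question is similar to nothing
    return dot / (norm_a * norm_b)


def rank_candidates(new_question, existing_questions):
    """Return (score, question) pairs sorted from most to least similar."""
    query = Counter(tokenize(new_question))
    scored = [
        (cosine_similarity(query, Counter(tokenize(q))), q)
        for q in existing_questions
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical example questions, invented for illustration
# (not taken from the paper's data sets).
existing = [
    "How do I detect collisions between sprites in a 2D game?",
    "What is the best way to store save game data?",
    "How can I make my camera follow the player smoothly?",
]
new = "How to detect collision between two sprites in 2D?"

for score, question in rank_candidates(new, existing):
    print(f"{score:.2f}  {question}")
```

A word-count model like this misses paraphrases that share few words, which is one reason a pre-trained sentence-embedding model may perform better; the scoring function could be swapped out while keeping the same ranking step.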