Developing a corpus of plagiarised short answers

Plagiarism is widely acknowledged to be a significant and increasing problem for higher education institutions (McCabe 2005; Judge 2008). A wide range of solutions, including several commercial systems, have been proposed to assist the educator in the task of identifying plagiarised work, or even to...

Bibliographic details
Published in:	Language Resources and Evaluation 2011-03, Vol.45 (1), p.5-24
Main authors:	Clough, Paul; Stevenson, Mark
Format:	Article
Language:	eng
Subjects:	Authoring; Authorship; Computational Linguistics; Computer Science; Construction; Education; Higher education; Higher education institutions; Identification; Information retrieval; Language and Literature; Legal proceedings; Linguistics; Natural resources; Object oriented programming; Paraphrase; Plagiarism; Question answer sequences; Search engines; Simulation; Social Sciences; Students; Tasks; Text analysis; Wikipedia
Online access:	Full text

container_end_page	24
container_issue	1
container_start_page	5
container_title	Language Resources and Evaluation
container_volume	45
creator	Clough, Paul; Stevenson, Mark
description	Plagiarism is widely acknowledged to be a significant and increasing problem for higher education institutions (McCabe 2005; Judge 2008). A wide range of solutions, including several commercial systems, have been proposed to assist the educator in the task of identifying plagiarised work, or even to detect them automatically. Direct comparison of these systems is made difficult by the problems in obtaining genuine examples of plagiarised student work. We describe our initial experiences with constructing a corpus consisting of answers to short questions in which plagiarism has been simulated. This corpus is designed to represent types of plagiarism that are not included in existing corpora and will be a useful addition to the set of resources available for the evaluation of plagiarism detection systems.
doi_str_mv	10.1007/s10579-009-9112-1
format	Article
fulltext	fulltext
identifier	ISSN: 1574-020X
ispartof	Language Resources and Evaluation, 2011-03, Vol.45 (1), p.5-24
issn	1574-020X 1572-8412 1574-0218
language	eng
recordid	cdi_proquest_miscellaneous_914765118
source	SpringerLink Journals; Jstor Complete Legacy
subjects	Authoring; Authorship; Computational Linguistics; Computer Science; Construction; Education; Higher education; Higher education institutions; Identification; Information retrieval; Language and Literature; Legal proceedings; Linguistics; Natural resources; Object oriented programming; Paraphrase; Plagiarism; Question answer sequences; Search engines; Simulation; Social Sciences; Students; Tasks; Text analysis; Wikipedia
title	Developing a corpus of plagiarised short answers
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T01%3A56%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Developing%20a%20corpus%20of%20plagiarised%20short%20answers&rft.jtitle=Language%20Resources%20and%20Evaluation&rft.au=Clough,%20Paul&rft.date=2011-03-01&rft.volume=45&rft.issue=1&rft.spage=5&rft.epage=24&rft.pages=5-24&rft.issn=1574-020X&rft.eissn=1572-8412&rft.coden=COHUAD&rft_id=info:doi/10.1007/s10579-009-9112-1&rft_dat=%3Cjstor_proqu%3E41486025%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=854392508&rft_id=info:pmid/&rft_jstor_id=41486025&rfr_iscdi=true