Copy Detection in Chinese Documents Using Ferret

Bibliographic details
Published in: Language Resources and Evaluation 2006-12, Vol. 40 (3/4), p. 357-365
Main authors: Bao, Jun Peng; Lyon, Caroline; Lane, Peter C. R.
Format: Article
Language: English
Description
Summary: The Ferret copy detector has been used since 2001 to find plagiarism in large collections of students' coursework in English. This article reports on extending its application to Chinese, with experiments on corpora of coursework collected from two Chinese universities. Our experiments show that Ferret can find both artificially constructed plagiarism and actually occurring, previously undetected plagiarism. We discuss issues of representation, focus on the effectiveness of a sub-symbolic approach, and show that Ferret does not need to find word boundaries first.
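
The point that no word segmentation is needed can be illustrated with a minimal sketch. It assumes, beyond what this record states, that two documents are compared by the overlap of their character trigram sets, scored with a Jaccard coefficient. The function names `char_trigrams` and `resemblance` are hypothetical, and this is not Ferret's actual implementation.

```python
def char_trigrams(text):
    """Return the set of overlapping 3-character sequences in text."""
    # Assumption: only whitespace is dropped; no word segmentation is attempted,
    # so Chinese text is treated as a plain sequence of characters.
    chars = [c for c in text if not c.isspace()]
    return {"".join(chars[i:i + 3]) for i in range(len(chars) - 2)}


def resemblance(a, b):
    """Jaccard coefficient of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = char_trigrams(a), char_trigrams(b)
    if not ta or not tb:
        return 0.0  # too short to form any trigram
    return len(ta & tb) / len(ta | tb)
```

Under these assumptions, a pair of documents with an unusually high score would be flagged for manual inspection; any threshold would have to be chosen empirically for the corpus at hand.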
ISSN: 1574-020X, 1572-8412, 1574-0218
DOI: 10.1007/s10579-007-9020-1