Copy Detection in Chinese Documents Using Ferret
Published in: | Language Resources and Evaluation 2006-12, Vol.40 (3/4), p.357-365 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | The Ferret copy detector has been used since 2001 to find plagiarism in large collections of students' coursework in English. This article reports on extending its application to Chinese, with experiments on corpora of coursework collected from two Chinese universities. Our experiments show that Ferret can find both artificially constructed plagiarism and actually occurring, previously undetected plagiarism. We discuss issues of representation, focus on the effectiveness of a sub-symbolic approach, and show that Ferret does not need to find word boundaries first. |
---|---|
ISSN: | 1574-020X 1572-8412 1574-0218 |
DOI: | 10.1007/s10579-007-9020-1 |
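
The summary's point that word boundaries need not be found first can be illustrated with a minimal sketch that compares two texts by the overlap of their character n-grams. This is not the Ferret implementation: the function names `char_ngrams` and `resemblance`, the default of n = 3, and the Jaccard-style score are all assumptions chosen for illustration, and the record above does not specify the representation the article actually uses.

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of overlapping character n-grams, ignoring whitespace.

    Hypothetical helper: no word segmentation is performed, so the same
    code applies to unsegmented Chinese text.
    """
    chars = "".join(text.split())
    return {chars[i:i + n] for i in range(len(chars) - n + 1)}


def resemblance(a: str, b: str, n: int = 3) -> float:
    """Jaccard-style overlap of the two texts' n-gram sets, from 0.0 to 1.0.

    Assumed scoring rule for this sketch: shared n-grams divided by all
    distinct n-grams found in either text.
    """
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        # Both texts are shorter than n characters: nothing to compare.
        return 0.0
    return len(grams_a & grams_b) / len(union)
```

Calling `resemblance(doc_a, doc_b)` on each pair of documents in a collection and ranking the pairs by score would surface candidates for manual inspection; a high score indicates shared text but does not by itself establish plagiarism.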