Syntactic and Semantic Based Similarity Measurement for Plagiarism Detection

Bibliographic details
Published in: International journal of innovative technology and exploring engineering, 2019-12, Vol. 9 (2), p. 155-159
Main authors: S, Sumathi; M.P, Geetha; Kumar, Dr P Ganesh; Pushpalatha, Dr K; Shanthi, Dr A S
Format: Article
Language: English
Description
Abstract: In the digital era, the ready availability of a huge number of online documents encourages plagiarism, the act of copying another person's work. Paper-based documents are stored in digital libraries for future reference. Historically, the Latin word “plagiarius” was used to denote the act of stealing someone else's work. Plagiarism is the use of another person's ideas, concepts, words, or structures without citing the source where original work is expected. The main objective of this paper is to compare the contents of an original document against the contents of other documents and identify matches. Matches are determined by both syntactic matching and semantic similarity. The paper employs a Sentence Hashing Algorithm for Plagiarism Detection, which focuses on complete sentence sequences and calculates a hash sum for each sentence sequence. When the user compares the original document with several documents, a similarity value of 1 means that the contents of the original document are 100% identical to those of the compared document, i.e., fully plagiarized. A similarity value between 0.1 and 0.9 indicates that the document is partially plagiarized. A similarity value of 0 means that the original document is unique.
ISSN:2278-3075
DOI:10.35940/ijitee.A5268.129219
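
The sentence-hashing comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the hash function, the sentence splitter, or the exact similarity formula, so the use of SHA-256, the punctuation-based splitting, and the function names `split_sentences`, `sentence_hashes`, `similarity`, and `classify` are all assumptions made for illustration. The sketch covers only the syntactic (exact sentence) matching, not the semantic similarity the paper also mentions.

```python
import hashlib
import re


def split_sentences(text):
    """Split text on ., ! and ? and normalize each sentence.

    Assumption: lowercasing and whitespace collapsing stand in for
    whatever preprocessing the paper actually applies.
    """
    sentences = []
    for part in re.split(r"[.!?]+", text):
        normalized = " ".join(part.lower().split())
        if normalized:
            sentences.append(normalized)
    return sentences


def sentence_hashes(text):
    """Return the set of hash sums, one per distinct normalized sentence."""
    # Assumption: SHA-256 is used here; the abstract names no hash function.
    return {
        hashlib.sha256(sentence.encode("utf-8")).hexdigest()
        for sentence in split_sentences(text)
    }


def similarity(original, candidate):
    """Fraction of the original's sentences whose hash also occurs in candidate."""
    original_hashes = sentence_hashes(original)
    if not original_hashes:
        return 0.0
    shared = original_hashes & sentence_hashes(candidate)
    return len(shared) / len(original_hashes)


def classify(score):
    """Map a similarity value to the three outcomes named in the abstract."""
    if score == 1.0:
        return "fully plagiarized"
    if score > 0.0:
        return "partially plagiarized"
    return "unique"


if __name__ == "__main__":
    original = "A cat sat. A dog ran."
    candidate = "a dog ran! Birds fly."
    score = similarity(original, candidate)
    print(score, classify(score))
```

Hashing whole sentences keeps the comparison cheap, since only fixed-length digests are compared, but any rewording of a sentence changes its hash, which is why an approach like this needs a separate semantic measure to catch paraphrased text.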