Designing an efficient unigram keyword detector for documents using Relative Entropy

Bibliographic details
Published in: Multimedia Tools and Applications, 2022-11, Vol. 81 (26), pp. 37747-37761
Main authors: Rathi, R. N., Mustafi, A.
Format: Article
Language: English
Description
Abstract: In this work we propose a statistical approach to identifying unigram keywords for a document. We identify features that effectively capture the importance of a word in a document and evaluate its potential to be a keyword. The relative entropy, displacement, and variance of terms in a document are evaluated in the context of keyword identification. The proposed approach works on single documents and requires no pre-training of a model. We also evaluate the effectiveness of the proposed feature set against the gold standard of "term frequency". The results of the proposed method are presented and compared with those of existing algorithms.
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-022-12657-x
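
The abstract names relative entropy and term frequency as per-term features but does not give their definitions. The following is a minimal sketch of how such features might be computed for a single document, not the authors' method. It assumes that a term's relative entropy is the Kullback-Leibler divergence between the term's distribution over equal-sized document segments and a uniform distribution. The names `segment_distribution`, `score_terms`, and `n_segments` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relative_entropy(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Entries with p_i == 0 contribute nothing to the sum.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def segment_distribution(tokens, term, n_segments):
    """Fraction of the term's occurrences in each of n_segments slices.

    ASSUMPTION: slicing the document into equal-sized segments is an
    illustrative choice, not something stated in the abstract.
    """
    counts = [0] * n_segments
    n = len(tokens)
    for i, tok in enumerate(tokens):
        if tok == term:
            counts[i * n_segments // n] += 1
    total = sum(counts)
    return [c / total for c in counts]


def score_terms(text, n_segments=4):
    """Map each term to (term frequency, relative entropy vs. uniform)."""
    tokens = tokenize(text)
    tf = Counter(tokens)
    uniform = [1 / n_segments] * n_segments  # assumed reference distribution
    return {
        term: (count, relative_entropy(
            segment_distribution(tokens, term, n_segments), uniform))
        for term, count in tf.items()
    }
```

Under these assumptions, a term spread evenly across the document scores close to zero, while a term concentrated in one segment scores higher. How the paper combines this with displacement and variance is not stated in this record.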