An efficient contextual glove feature extraction model on large textual databases

Bibliographic details
Published in: International Journal of Speech Technology, 2022, Vol. 25 (4), pp. 793-802
Main authors: Anjali Devi, S; Sivakumar, S
Format: Article
Language: English
Description
Abstract: Keyphrase extraction is a major challenge in large textual databases because of noise and high-dimensional features. As biomedical document sets grow, finding and extracting keyphrases becomes difficult because of high computational time and memory requirements. Traditional word embedding and string similarity methods have significant issues such as high dimensionality, a large number of candidate sets, and contextual similarity error rate on large textual datasets. To overcome these issues, a hybrid word embedding method and contextual similarity measures are needed to find and filter the essential keyphrases that have strong contextual similarity among the large candidate sets. In this paper, a hybrid GloVe word embedding model, together with contextual similarity and string similarity measures, is implemented on large textual document sets for keyphrase extraction and ranking. Experimental results show that the contextual similarity of the proposed keyphrase extraction model is better than that of existing keyphrase and string similarity measures on biomedical document sets.
ISSN: 1381-2416, 1572-8110
DOI: 10.1007/s10772-021-09884-2
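
Illustration: the abstract describes ranking candidate keyphrases by combining an embedding-based contextual similarity with a string similarity. The record gives no formulas, so the following is only a minimal sketch of that general idea, not the authors' model. Everything in it is an assumption made for illustration: the toy 3-dimensional vectors in `TOY_VECTORS` stand in for pretrained GloVe embeddings, phrase vectors are simple word averages, string similarity comes from Python's `difflib.SequenceMatcher`, and the weight `alpha` is a hypothetical parameter.

```python
import math
from difflib import SequenceMatcher

# Made-up 3-dimensional vectors standing in for pretrained GloVe embeddings
# (assumption: real GloVe vectors would be loaded from a file instead).
TOY_VECTORS = {
    "gene": [0.9, 0.1, 0.0],
    "expression": [0.8, 0.2, 0.1],
    "protein": [0.7, 0.3, 0.0],
    "weather": [0.0, 0.1, 0.9],
    "report": [0.1, 0.2, 0.8],
}


def phrase_vector(phrase):
    """Average the vectors of the words in the phrase that have an embedding."""
    vecs = [TOY_VECTORS[w] for w in phrase.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def string_similarity(a, b):
    """Character-level similarity ratio in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(candidates, reference, alpha=0.7):
    """Rank candidates by a weighted mix of contextual and string similarity.

    alpha (hypothetical) weights the embedding-based score; 1 - alpha weights
    the string-based score.
    """
    ref_vec = phrase_vector(reference)
    scored = []
    for candidate in candidates:
        contextual = cosine(phrase_vector(candidate), ref_vec)
        surface = string_similarity(candidate, reference)
        scored.append((candidate, alpha * contextual + (1 - alpha) * surface))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_candidates(
    ["protein expression", "weather report", "gene"],
    reference="gene expression",
)
for phrase, score in ranked:
    print(f"{phrase}: {score:.3f}")
```

With these toy vectors, "protein expression" ranks first and "weather report" last against the reference "gene expression". A weighted sum is only one simple way to mix the two scores; the paper may combine or filter them differently.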