An efficient contextual glove feature extraction model on large textual databases
Published in: | International journal of speech technology 2022, Vol.25 (4), p.793-802 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Keyphrase extraction is a major challenge in large textual databases because of noise and high-dimensional features. As biomedical document sets grow, finding and extracting keyphrases becomes difficult because of high computational time and memory requirements. Traditional word embedding and string similarity methods suffer from high dimensionality, a large number of candidate sets, and contextual similarity errors on large textual datasets. To overcome these issues, a hybrid word embedding method and contextual similarity measures are needed to find and filter the essential keyphrases that have strong contextual similarity among the large candidate sets. In this paper, a hybrid GloVe word embedding model, contextual similarity measures, and string similarity measures are implemented on large textual document sets for keyphrase extraction and ranking. Experimental results show that the contextual similarity achieved by the proposed keyphrase extraction model is better than that of existing keyphrase and string similarity measures on biomedical document sets. |
---|---|
ISSN: | 1381-2416, 1572-8110 |
DOI: | 10.1007/s10772-021-09884-2 |