GENERATING A PERSONAL CORPUS

In an approach for generating a user-specific personal corpus, a processor creates a basic corpus for a first user using a first set of data sources, wherein the basic corpus includes one or more basic words and one or more vectors of the one or more basic words. A processor extracts a set of text f...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: WATANABE, KENTA, Fukuda, Takashi, MIYAMOTO, TAIHEI, Tashiro, Takahito
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In an approach for generating a user-specific personal corpus, a processor creates a basic corpus for a first user using a first set of data sources, wherein the basic corpus includes one or more basic words and one or more vectors of the one or more basic words. A processor extracts a set of text from a second set of data sources associated with the first user. Responsive to finding an unknown word included in the set of text extracted, a processor updates the basic corpus, wherein the basic corpus is updated by replacing a vector of the unknown word with an average vector of the one or more basic words in the basic corpus created and registering the unknown word in a first personal corpus.