REPRESENTATIVE WORD EXTRACTION DEVICE, REPRESENTATIVE WORD EXTRACTION METHOD, AND REPRESENTATIVE WORD EXTRACTION PROGRAM

Bibliographic Details
Main Authors: KOBAYASHI TORU, NAGANO SHOICHI, ICHIKAWA YUSUKE
Format: Patent
Language: eng
Description
Summary: PROBLEM TO BE SOLVED: To extract a word that represents a document group, independently of the number of documents included in the document group. SOLUTION: A preprocessing part 11 collects document groups, including a target document group from which a representative word is to be extracted, and a reference word acquiring part 13 acquires a reference word that serves as the basis for extracting the representative word. A reference document specifying part 14 specifies the reference documents, i.e. those containing the reference word, among the document groups input from the preprocessing part 11, and a word group extracting part 15 extracts the reference word and the words other than the reference word from the reference documents as a word group. An index calculating part 16 calculates, for each word of the extracted word group, an index whose value increases or decreases in accordance with the magnitude of the word's co-occurrence frequency with the reference word. Then, an index correcting part 17 calculates, for each word of the extracted word group, the degree of rarity in the whole set of document groups and the degree of rarity in the target document group, and corrects the index calculated by the index calculating part 16 by using the two calculated degrees of rarity.
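
The flow of parts 14 to 17 can be illustrated with a minimal Python sketch. The abstract does not state any formulas, so everything numeric here is an assumption made for illustration: the index is taken to be the number of reference documents in which a word appears, the degree of rarity is a smoothed inverse document frequency, and the correction multiplies the index by the rarity in the whole collection and divides it by the rarity in the target group. The function names `rarity` and `extract_representative_words` are hypothetical, and documents are assumed to be already tokenized into sets of words.

```python
import math
from collections import Counter


def rarity(word, docs):
    """Assumed rarity measure: smoothed inverse document frequency.

    Larger when fewer documents in `docs` contain `word`.
    """
    df = sum(1 for doc in docs if word in doc)
    return math.log((1 + len(docs)) / (1 + df)) + 1.0


def extract_representative_words(all_docs, target_docs, reference_word):
    """Rank words co-occurring with `reference_word` (hypothetical helper).

    `all_docs` is the whole collection and is assumed to include `target_docs`.
    """
    # Part 14: reference documents are those containing the reference word.
    reference_docs = [doc for doc in all_docs if reference_word in doc]

    # Parts 15 and 16: collect the word group and count, per word, the number
    # of reference documents it appears in (assumed co-occurrence index).
    index = Counter()
    for doc in reference_docs:
        index.update(set(doc))

    # Part 17: correct the index with the two degrees of rarity (assumed rule:
    # favor words rare in the whole collection but common in the target group).
    scores = {
        word: count * rarity(word, all_docs) / rarity(word, target_docs)
        for word, count in index.items()
    }
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data, invented for this example only.
target_docs = [{"camera", "lens", "zoom"}, {"camera", "lens", "tripod"}]
other_docs = [
    {"camera", "phone", "battery"},
    {"phone", "battery", "screen"},
    {"recipe", "pasta", "sauce"},
]
ranking = extract_representative_words(target_docs + other_docs, target_docs, "camera")
for word, score in ranking:
    print(f"{word}: {score:.3f}")
```

Because both rarity terms are ratios over document counts, the corrected score in this sketch depends on the proportion of documents containing a word rather than on the absolute size of the target group, which is one possible reading of the stated goal of not depending on the number of documents.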