REPRESENTATIVE WORD EXTRACTION DEVICE, REPRESENTATIVE WORD EXTRACTION METHOD, AND REPRESENTATIVE WORD EXTRACTION PROGRAM
Main Authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | PROBLEM TO BE SOLVED: To extract a word that represents a document group without depending on the number of documents included in the document group. SOLUTION: A preprocessing part 11 collects document groups, including a target document group from which a representative word is to be extracted, and a reference word acquiring part 13 acquires a reference word that serves as the basis for extracting the representative word. A reference document specifying part 14 specifies, from the document groups inputted from the preprocessing part 11, each reference document that includes the reference word, and a word group extracting part 15 extracts the reference word and the words other than the reference word from the reference documents as a word group. An index calculating part 16 calculates, for each word of the extracted word group, an index whose value increases or decreases in accordance with the magnitude of that word's co-occurrence frequency with the reference word. Then, an index correcting part 17 calculates, for each word of the extracted word group, the degree of rarity in the whole of the document groups and the degree of rarity in the target document group, and corrects the index calculated by the index calculating part 16 by using the two calculated degrees of rarity. |
---|
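
The pipeline in the summary (reference documents, word group, co-occurrence index, rarity-based correction) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the function names `rarity` and `representative_words`, the use of document-level co-occurrence counts as the index, a smoothed inverse document frequency as the "degree of rarity", and a correction that multiplies the index by the rarity in the whole collection and divides it by the rarity in the target group. It is not the patent's actual method.

```python
import math
from collections import Counter


def rarity(word, doc_sets):
    # ASSUMPTION: "degree of rarity" modeled as a smoothed inverse document
    # frequency; the value grows as fewer documents contain the word.
    df = sum(1 for d in doc_sets if word in d)
    return math.log((len(doc_sets) + 1) / (df + 1)) + 1.0


def representative_words(all_docs, target_docs, reference_word, top_n=3):
    all_sets = [set(d) for d in all_docs]
    target_sets = [set(d) for d in target_docs]

    # Reference documents: collected documents that contain the reference word.
    reference_docs = [d for d in all_sets if reference_word in d]

    # Word group and index: for each word in the reference documents
    # (the reference word included), count the reference documents it appears in.
    index = Counter(w for d in reference_docs for w in d)

    # ASSUMPTION: correction rewards words that are rare in the whole
    # collection and penalizes words that are rare in the target group.
    scores = {
        w: idx * rarity(w, all_sets) / rarity(w, target_sets)
        for w, idx in index.items()
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the patent.
    target = [["solar", "panel", "energy"], ["solar", "battery", "energy"]]
    others = [["energy", "oil", "price"], ["oil", "market", "price"],
              ["market", "stock", "price"]]
    print(representative_words(target + others, target, "energy"))
```

Because both rarity terms are computed from document proportions rather than raw counts, the score in this sketch does not grow simply because the target group holds more documents, which is the property the abstract aims for.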