CLASSIFICATION METHOD, APPARATUS, AND PROGRAM

Bibliographic details
Main authors: KUDO JUMMA, YAMAKOSHI KOTA, MIYAGI TOSHIHIDE, HIROTA KEISUKE, HANAWA DAIKI
Format: Patent
Language: eng; jpn
Description
Summary: PROBLEM: To improve the accuracy of text classification. SOLUTION: A receiving/analysis section 12 acquires, for any one of multiple received texts, multiple pieces of analysis result information, each including a pair of a morpheme contained in the text and attribute information of that morpheme. A dividing section 14 refers to a storage section that stores morpheme information, which includes a specific morpheme and attribute information of the specific morpheme, and determines whether the acquired pieces of analysis result information include the pair of the specific morpheme and its attribute information. When the determination is affirmative, the dividing section 14 divides the text at a position corresponding to the appearance position, in the text, of the morpheme included in one of the pieces of analysis result information, thereby generating multiple texts. A classification section 16 classifies the generated texts and the other received texts into multiple clusters. SELECTED DRAWING: Figure 9
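The flow in the summary (analyze, check against stored morpheme information, divide, classify) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the whitespace tokenizer and toy `LEXICON` stand in for a real morphological analyzer, `SPLIT_MORPHEMES` is a hypothetical stored morpheme table, the triggering morpheme is dropped at the split point, every received text is passed through the dividing step, and the greedy Jaccard grouping is only a placeholder for whatever clustering method the classification section actually uses.

```python
from typing import List, Set, Tuple

Analysis = Tuple[str, str]  # one analysis result: (morpheme, attribute)

# Hypothetical stored morpheme information: pairs that trigger a division.
SPLIT_MORPHEMES: Set[Analysis] = {("but", "conjunction"), ("and", "conjunction")}

# Toy lexicon standing in for a real morphological analyzer (assumption).
LEXICON = {"but": "conjunction", "and": "conjunction"}


def analyze(text: str) -> List[Analysis]:
    """Return (morpheme, attribute) pairs; whitespace tokens act as morphemes."""
    return [(w, LEXICON.get(w.lower(), "other")) for w in text.split()]


def divide(text: str) -> List[str]:
    """Split the text wherever a stored (morpheme, attribute) pair appears."""
    parts: List[str] = []
    current: List[str] = []
    for morpheme, attribute in analyze(text):
        if (morpheme.lower(), attribute) in SPLIT_MORPHEMES:
            # Affirmative determination: close the current segment here.
            if current:
                parts.append(" ".join(current))
            current = []
        else:
            current.append(morpheme)
    if current:
        parts.append(" ".join(current))
    return parts


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(texts: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Greedy grouping: join the first cluster whose first member is similar."""
    clusters: List[List[str]] = []
    for text in texts:
        words = set(text.lower().split())
        for cluster in clusters:
            if jaccard(words, set(cluster[0].lower().split())) >= threshold:
                cluster.append(text)
                break
        else:
            clusters.append([text])
    return clusters


def run(texts: List[str]) -> List[List[str]]:
    """Divide each received text, then cluster all resulting segments."""
    segments = [seg for t in texts for seg in divide(t)]
    return classify(segments)
```

In this sketch, a text that mixes two topics joined by a conjunction is divided first, so each half can land in a different cluster instead of pulling the whole text into one.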