TOPIC MODEL AUTOMATION METHOD AND APPARATUS USING LATENT DIRICHLET ALLOCATION

Bibliographic details
Main authors: PARK SANG MIN, AHN JOONG WOOK, OH SEON YEONG, ON BYUNG WON
Format: Patent
Language: English; Korean
Description
Summary: Disclosed are a topic modeling method using latent Dirichlet allocation (LDA) and an apparatus therefor. According to one embodiment, the topic modeling method comprises the steps of: extracting topics from an object and optimizing the prior probability associated with the distribution of the topics; refining the topics by splitting or merging the extracted topics based on the prior probability; and automatically labeling each refined topic by extracting a sentence that represents it.
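
The abstract does not say how topics are merged or how the representative sentence is chosen, so the following is only a minimal sketch of the merge and labeling steps under stated assumptions. It assumes topics are already available as word-probability dictionaries (for example, from a fitted LDA model), merges two topics when their Jensen-Shannon divergence falls below a hypothetical threshold, and labels a topic with the sentence whose words have the highest mean probability under it. The function names, the threshold of 0.2, the averaging rule, and the scoring rule are all illustrative choices, not the patented method. Prior optimization and topic splitting are not shown.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions.

    p and q are dicts mapping word -> probability; missing words count as 0.
    The result lies in [0, 1]: 0 for identical distributions, 1 for disjoint ones.
    """
    div = 0.0
    for w in set(p) | set(q):
        pw, qw = p.get(w, 0.0), q.get(w, 0.0)
        mw = 0.5 * (pw + qw)
        if pw > 0:
            div += 0.5 * pw * math.log2(pw / mw)
        if qw > 0:
            div += 0.5 * qw * math.log2(qw / mw)
    return div


def merge_similar_topics(topics, threshold=0.2):
    """Greedily merge topics that are close in JS divergence.

    Assumption: the threshold and the equal-weight averaging are illustrative;
    the source only states that merging is based on the prior probability.
    """
    merged = []
    for topic in topics:
        for i, kept in enumerate(merged):
            if js_divergence(topic, kept) < threshold:
                vocab = set(topic) | set(kept)
                # Averaging two distributions yields another distribution.
                merged[i] = {
                    w: 0.5 * (topic.get(w, 0.0) + kept.get(w, 0.0)) for w in vocab
                }
                break
        else:
            # No sufficiently similar topic was found; keep this one as is.
            merged.append(dict(topic))
    return merged


def label_topic(topic, sentences):
    """Return the sentence whose words have the highest mean topic probability."""

    def score(sentence):
        words = sentence.lower().split()
        if not words:
            return 0.0
        return sum(topic.get(w, 0.0) for w in words) / len(words)

    return max(sentences, key=score)


# Toy input: two near-duplicate sports topics and one finance topic.
example_topics = [
    {"goal": 0.5, "match": 0.3, "team": 0.2},
    {"goal": 0.4, "match": 0.4, "team": 0.2},
    {"stock": 0.6, "market": 0.3, "bank": 0.1},
]
example_sentences = [
    "The team scored a late goal in the match",
    "The bank raised rates as the stock market fell",
]

refined = merge_similar_topics(example_topics)
labels = [label_topic(t, example_sentences) for t in refined]
```

On this toy input the two sports topics collapse into one, and each refined topic is labeled with the sentence from its own domain. A real system would likely remove stop words and normalize for sentence length more carefully before scoring.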