Method and system for quickly classifying subdivided industry news by fusing domain knowledge

Bibliographic Details
Main authors: MA TAO, WANG JUNJIE, WANG ANNING, JIA ZIYAO, DENG YUNCHONG, ZHANG QIANG, DING JIAMING
Format: Patent
Language: Chinese; English
Description
Summary: The invention provides a method and system for rapidly classifying news from subdivided industries by fusing domain knowledge, and relates to the field of text classification. The method comprises the following steps: S1, collecting and preprocessing news oriented to subdivided industries; S2, obtaining a first named entity set for the news title by named entity recognition, extracting, for each entity in the first named entity set, a first entity association set from a pre-constructed asymmetric entity association network, and, if the first entity association set is non-empty, proceeding to S3; S3, using a naive Bayes algorithm to calculate, from the first named entity set and the first entity association set, the conditional probability of each classification category given the news title, and, if the conditional probability is greater than a first threshold, obtaining a preliminary classification of the news. The asymmetric relation network diagram comprises lar
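
Steps S2 and S3 of the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy network, the training samples, the 0.8 threshold, and every name in it (NETWORK, recognize_entities, associated_entities, EntityNaiveBayes, classify_title) are assumptions made for illustration. Dictionary matching stands in for a real named entity recognizer, a multinomial naive Bayes with Laplace smoothing stands in for the unspecified naive Bayes variant, and the function returns None where the summary does not say what happens (empty association set, or probability at or below the threshold).

```python
import math
from collections import Counter, defaultdict

# Hypothetical asymmetric (directed) entity association network:
# an edge A -> B does not imply an edge B -> A.
NETWORK = {
    "CATL": {"lithium battery", "electric vehicle"},
    "BYD": {"electric vehicle"},
    "TSMC": {"wafer", "chip"},
    "SMIC": {"chip"},
}


def recognize_entities(title, known_entities):
    """Stand-in for a real NER model: case-insensitive dictionary matching."""
    lowered = title.lower()
    return {e for e in known_entities if e.lower() in lowered}


def associated_entities(entities, network):
    """S2: collect associated entities by following outgoing edges only."""
    related = set()
    for entity in entities:
        related |= network.get(entity, set())
    return related - entities


class EntityNaiveBayes:
    """Multinomial naive Bayes over entity features, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()

    def fit(self, samples):
        for features, label in samples:
            self.class_counts[label] += 1
            self.feature_counts[label].update(features)
            self.vocabulary |= set(features)
        return self

    def posterior(self, features):
        """Return P(category | features) for every category seen in training."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / total_docs)
            for feature in features:
                score += math.log((counts[feature] + self.alpha) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow.
        peak = max(log_scores.values())
        norm = sum(math.exp(s - peak) for s in log_scores.values())
        return {label: math.exp(s - peak) / norm for label, s in log_scores.items()}


def classify_title(title, model, network, threshold=0.8):
    """S2 + S3: return a preliminary category, or None if the sketch cannot decide."""
    known = set(network) | {t for targets in network.values() for t in targets}
    entities = recognize_entities(title, known)
    related = associated_entities(entities, network)
    if not related:
        return None  # empty association set: handling not covered by the summary
    probabilities = model.posterior(entities | related)
    label, probability = max(probabilities.items(), key=lambda kv: kv[1])
    return label if probability > threshold else None


# Toy training data (invented for this sketch).
SAMPLES = [
    ({"CATL", "lithium battery", "electric vehicle"}, "new energy"),
    ({"BYD", "electric vehicle"}, "new energy"),
    ({"TSMC", "wafer", "chip"}, "semiconductor"),
    ({"SMIC", "chip"}, "semiconductor"),
]
MODEL = EntityNaiveBayes().fit(SAMPLES)

print(classify_title("CATL expands production", MODEL, NETWORK))
print(classify_title("Electric vehicle chip demand", MODEL, NETWORK))
```

In this toy setup the first title yields a category because "CATL" has outgoing edges in the network, while the second yields None: "electric vehicle" and "chip" appear only as edge targets, so their association set is empty. That is the practical effect of the network being asymmetric.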