Method and system for quickly classifying subdivided industry news by fusing domain knowledge
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Abstract: | The invention provides a method and system for rapidly classifying subdivided-industry news by fusing domain knowledge, and relates to the field of text classification. The method comprises the following steps: S1, collecting and preprocessing news oriented to subdivided industries; S2, obtaining a first named entity set for the news title by named entity recognition, extracting, for each entity in the first named entity set, a first entity association set from a pre-constructed asymmetric entity association network, and, if the first entity association set is non-empty, proceeding to S3; S3, using a naive Bayes algorithm to calculate, from the first named entity set and the first entity association set, the conditional probability of each classification category for the news title, and, if the conditional probability is greater than a first threshold, obtaining a preliminary classification of the news. The asymmetric relation network diagram comprises lar |
---|
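
Steps S2 and S3 of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The network contents, category names, training samples, Laplace smoothing, and the threshold value 0.8 are all hypothetical assumptions chosen for illustration. Named entity recognition is assumed to have run already, so the function receives the title's entities directly. The abstract is truncated before it says what happens when the association set is empty or the threshold is not met, so the sketch simply returns `None` in both cases.

```python
import math
from collections import Counter, defaultdict

# Hypothetical asymmetric (directed) entity association network:
# an edge a -> b does not imply b -> a.
ASSOCIATIONS = {
    "lithium": {"battery", "electric vehicle"},
    "battery": {"electric vehicle"},
    "wafer": {"semiconductor"},
}


def associated_entities(entities):
    """S2: union of the outgoing associations of every title entity."""
    related = set()
    for entity in entities:
        related |= ASSOCIATIONS.get(entity, set())
    return related


class EntityNaiveBayes:
    """Multinomial naive Bayes over entity features (Laplace smoothing assumed)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, samples):
        for features, label in samples:
            self.class_counts[label] += 1
            for feature in features:
                self.feature_counts[label][feature] += 1
                self.vocab.add(feature)
        return self

    def posteriors(self, features):
        """Return P(category | features) for every known category."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)
            for feature in features:
                if feature in self.vocab:  # unseen features are ignored
                    score += math.log((counts[feature] + self.alpha) / denom)
            log_scores[label] = score
        # Normalize in log space for numerical stability.
        peak = max(log_scores.values())
        exp = {k: math.exp(v - peak) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}


def classify_title(entities, model, threshold=0.8):
    """S2 + S3: return a preliminary category, or None if no decision is made."""
    related = associated_entities(entities)
    if not related:
        return None  # empty association set: S3 is not entered
    probs = model.posteriors(set(entities) | related)
    label, prob = max(probs.items(), key=lambda kv: kv[1])
    return label if prob > threshold else None


# Hypothetical training data: (entity features, category).
TRAINING = [
    ({"lithium", "battery", "electric vehicle"}, "new_energy"),
    ({"battery", "electric vehicle"}, "new_energy"),
    ({"wafer", "semiconductor"}, "chips"),
    ({"semiconductor"}, "chips"),
]
model = EntityNaiveBayes().fit(TRAINING)
```

With this toy data, `classify_title(["lithium"], model)` yields a preliminary category because the entity has outgoing associations, while `classify_title(["electric vehicle"], model)` returns `None`: the network is directed, and that entity has no outgoing edges even though other entities point to it.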