Text new word discovery and analysis method, system and device and medium


Bibliographic details
Main authors: DING ZHAOYUAN, WANG BIAO, ZHANG WENGUANG, ZHANG SHUJIANG, XING TIANWEI, YU JUNGAO
Format: Patent
Language: Chinese; English
Description
Summary: The invention discloses a method, system, device, and medium for discovering and analyzing new words (neologisms) in text. The method comprises the following steps: obtaining text data from each industry to form an original neologism data set; screening on the industry category field of the text content to obtain an industry document set; segmenting each piece of original neologism data in the original neologism data set into words and determining a first candidate neologism set; based on the industry document set and the first candidate neologism set, determining, through a pre-trained topic model, a topic keyword prediction probability corresponding to the original neologism data set, and updating the first candidate neologism set according to that probability to determine a second candidate neologism set; and performing clustering based on the original neologism data set and the second candidate neologism set, and determining a target candidate neologism se
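
The first steps of the summary can be illustrated with a toy sketch. It is not the patented implementation, and every name in it is hypothetical: the `industry` and `text` fields, the `known_vocab` set, and the `threshold` value are assumptions made for illustration. Whitespace splitting stands in for a real word segmenter, and relative token frequency over the industry document set stands in for the pre-trained topic model's keyword prediction probability. The clustering step is left out because the summary is cut off before describing it.

```python
from collections import Counter


def filter_by_industry(docs, industry):
    """Keep the text of documents whose (hypothetical) 'industry' field matches."""
    return [d["text"] for d in docs if d["industry"] == industry]


def first_candidates(raw_texts, known_vocab):
    """Segment each text and keep tokens missing from the known vocabulary.

    Whitespace splitting is an assumption standing in for a real segmenter.
    """
    candidates = set()
    for text in raw_texts:
        for token in text.lower().split():
            if token not in known_vocab:
                candidates.add(token)
    return candidates


def topic_keyword_probability(industry_docs):
    """Stand-in for the pre-trained topic model (assumption):
    relative frequency of each token over the industry document set."""
    counts = Counter(tok for doc in industry_docs for tok in doc.lower().split())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()} if total else {}


def second_candidates(candidates, probs, threshold):
    """Keep only candidates whose stand-in probability reaches the threshold."""
    return {w for w in candidates if probs.get(w, 0.0) >= threshold}


# Invented example data, for illustration only.
docs = [
    {"industry": "finance", "text": "defi yield farming on new defi platforms"},
    {"industry": "finance", "text": "banks study defi lending"},
    {"industry": "sports", "text": "the team won the match with tikitaka"},
]
known_vocab = {
    "yield", "farming", "on", "new", "platforms", "banks", "study",
    "lending", "the", "team", "won", "match", "with",
}

industry_docs = filter_by_industry(docs, "finance")
first = first_candidates([d["text"] for d in docs], known_vocab)
probs = topic_keyword_probability(industry_docs)
second = second_candidates(first, probs, threshold=0.1)
print(sorted(first), sorted(second))
```

In this sketch, a candidate that never appears in the chosen industry's documents receives probability zero and is dropped when the first candidate set is updated to the second.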