Weakly supervised text classification method, system and device based on self-supervised training

Bibliographic details
Main authors: ZHONG HAOWEN, CHEN DAIYUAN, YANG FEI, YANG YI
Format: Patent
Language: Chinese; English
Description
Summary: The invention discloses a weakly supervised text classification method, system and device based on self-supervised training. The method comprises the following steps: S1, obtaining the text data to be labeled and a corresponding category label set; S2, obtaining a pre-trained model; S3, transferring part of the weights of the pre-trained model to a text classification model; S4, obtaining text classification pseudo-labels through a self-supervised pseudo-label strategy. The weakly supervised text classification method based on self-supervised training is closer to practical text classification scenarios: the user only needs to provide the data to be labeled and the category label set, which greatly reduces the cost of labeling text data. Many large technology companies have now open-sourced pre-trained natural language models; these models have already learned general knowledge from massive amounts of information, so classification precision is ensured. And by adopting a transfer lear
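Steps S1 to S4 can be illustrated with a toy sketch. The abstract does not specify the pre-trained model, which weights are transferred, or how the pseudo-label strategy works, so everything below is an assumption made for illustration: the `PRETRAINED` table is a hypothetical stand-in for a real pre-trained model, "partial weights" is taken to mean the embedding layer only, and pseudo-labels are assigned by cosine similarity to the label names with a confidence threshold. It is not the patented method.

```python
import math

# S2 (hypothetical stand-in): a tiny "pre-trained model" with two components.
PRETRAINED = {
    "embeddings": {
        "sports": [1.0, 0.0], "match": [0.9, 0.1], "team": [0.8, 0.2],
        "finance": [0.0, 1.0], "stock": [0.1, 0.9], "bank": [0.2, 0.8],
    },
    "lm_head": "task-specific layer that the classifier does not need",
}


def transfer_partial_weights(pretrained):
    """S3 (assumed): copy only the embedding weights into the classifier."""
    return {"embeddings": dict(pretrained["embeddings"])}


def embed(model, text):
    """Average the embeddings of known words; unknown words are skipped."""
    table = model["embeddings"]
    vectors = [table[w] for w in text.lower().split() if w in table]
    if not vectors:
        return [0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def pseudo_label(model, texts, labels, threshold=0.9):
    """S4 (assumed strategy): pick the most similar label name per text and
    keep the pseudo-label only if its similarity reaches the threshold."""
    result = {}
    for text in texts:
        vector = embed(model, text)
        scores = {label: cosine(vector, embed(model, label)) for label in labels}
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:
            result[text] = best
    return result


# S1: unlabeled texts plus the category label set are the only user inputs.
texts = ["the team won the match", "bank stock fell", "weather today"]
labels = ["sports", "finance"]

classifier = transfer_partial_weights(PRETRAINED)
pseudo = pseudo_label(classifier, texts, labels)
```

In this sketch the third text contains no known words, so it receives no pseudo-label; in a self-training setup, confident pseudo-labels like the other two would typically be used to train the classifier further before relabeling the remainder.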