Corpus annotation method and device based on big data platform
Main Author: | |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online Access: | Order full text |
Summary: | The invention discloses a corpus annotation method and device based on a big data platform. The method comprises the following steps: collecting users' voice interaction logs from an artificial intelligence engine; analyzing the unknown corpora in the engine and classifying them into labeled and unlabeled corpora; pushing the unlabeled corpora to a business operation platform so that the platform can label them; and collecting the annotated corpora and updating the annotation corpus according to the collection result. Because the unknown corpora are classified, the business operation platform can selectively label the classified corpora. This solves the prior-art problems that pushed corpora carry no category priority and that manually labeling all pushed corpora in sequence is inefficient; labeling efficiency is improved, and meanwhile a system ca |
---|
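
The four steps named in the summary can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: every name here (`AnnotationCorpus`, `collect_unknown`, `classify`, `push_for_labeling`, `update_corpus`) is hypothetical, and it assumes that an unknown utterance counts as "labeled" when it already has an entry in the annotation corpus, which the record does not state. The business operation platform is stood in for by a plain callback.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass
class AnnotationCorpus:
    """Hypothetical store mapping an utterance to its annotation."""
    labels: Dict[str, str] = field(default_factory=dict)


def collect_unknown(log: Iterable[str], known: Set[str]) -> List[str]:
    """Step 1-2a: pull utterances the engine did not recognise from the log.

    Duplicates are dropped while the first-seen order is kept.
    """
    unknown: List[str] = []
    for utterance in log:
        if utterance not in known and utterance not in unknown:
            unknown.append(utterance)
    return unknown


def classify(unknown: List[str], corpus: AnnotationCorpus) -> Tuple[List[str], List[str]]:
    """Step 2b: split unknown utterances into labeled and unlabeled.

    Assumption: "labeled" means an annotation already exists in the corpus.
    """
    labeled = [u for u in unknown if u in corpus.labels]
    unlabeled = [u for u in unknown if u not in corpus.labels]
    return labeled, unlabeled


def push_for_labeling(unlabeled: List[str], annotate: Callable[[str], str]) -> Dict[str, str]:
    """Step 3: hand unlabeled utterances to the operations platform (a callback here)."""
    return {u: annotate(u) for u in unlabeled}


def update_corpus(corpus: AnnotationCorpus, new_labels: Dict[str, str]) -> None:
    """Step 4: merge the collected annotations back into the annotation corpus."""
    corpus.labels.update(new_labels)


if __name__ == "__main__":
    log = ["play jazz", "turn on lamp", "play jazz", "what is a qubit"]
    corpus = AnnotationCorpus({"play jazz": "music"})
    unknown = collect_unknown(log, known={"turn on lamp"})
    labeled, unlabeled = classify(unknown, corpus)
    update_corpus(corpus, push_for_labeling(unlabeled, lambda u: "qa"))
    print(labeled, unlabeled, corpus.labels)
```

Only the unlabeled group reaches the annotator callback, which mirrors the summary's point that classification lets the operations platform label selectively rather than working through every pushed utterance in sequence.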