Systems and Methods for the Automatic Classification of Documents
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | Systems and computer-implemented methods for classifying documents are provided that include: pretraining and then fine-tuning a machine learning model with a domain-specific dataset that includes a plurality of documents, each annotated with at least one label selected from a plurality of predefined labels for a given domain; and predicting, using the trained/fine-tuned machine learning model, at least one label from the plurality of labels for at least one other document. The machine learning model is preferably fine-tuned using a label attention multi-task learning process that includes: a first task for training the machine learning model with respect to all labels used for the plurality of documents in the dataset, and a second task for training the machine learning model with respect to a subset of all of the labels used for the plurality of documents in the dataset. |
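The summary names a label attention multi-task learning process but gives no formulas, so the following is only a minimal sketch of one way such a process might be arranged, not the patented method. It assumes a multi-label setting with binary cross-entropy, one attention query per label over token vectors, and a weighted sum of the two task losses. All names (`label_attention_logits`, `multitask_loss`, `subset_idx`, `subset_weight`) are hypothetical and do not come from the patent.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def label_attention_logits(token_vecs, label_queries, label_weights):
    """Return one logit per label (assumed form of label attention).

    Each label attends over the document's token vectors with its own
    query, builds a label-specific document vector, and scores it.
    """
    dim = len(token_vecs[0])
    logits = []
    for query, weight in zip(label_queries, label_weights):
        attn = softmax([dot(query, tok) for tok in token_vecs])
        doc_vec = [sum(a * tok[d] for a, tok in zip(attn, token_vecs))
                   for d in range(dim)]
        logits.append(dot(weight, doc_vec))
    return logits

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy, computed stably from raw logits."""
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def multitask_loss(logits, targets, subset_idx, subset_weight=0.5):
    """First task: loss over all labels. Second task: loss over a subset.

    How the subset is chosen and how the tasks are weighted are
    assumptions here; the summary does not specify either.
    """
    loss_all = bce_with_logits(logits, targets)
    loss_subset = bce_with_logits([logits[i] for i in subset_idx],
                                  [targets[i] for i in subset_idx])
    return loss_all + subset_weight * loss_subset

# Toy document with two token vectors and three candidate labels.
tokens = [[1.0, 0.0], [0.0, 1.0]]
queries = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = label_attention_logits(tokens, queries, weights)
loss = multitask_loss(logits, [1, 0, 1], subset_idx=[0, 2])
```

In a real system the token vectors would come from the pretrained model and the loss would be minimized by gradient descent during fine-tuning. The sketch only shows how a shared set of per-label logits can feed both an all-labels task and a label-subset task.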