Classification Evaluation and Improvement in Machine Learning Models
Persistent storage contains a training dataset and a test dataset, each with units of text labelled from a plurality of categories. A machine learning model has been trained with the training dataset to classify input units of text into the plurality of categories. One or more processors are configu...
Gespeichert in:
Hauptverfasser: | , , , , , , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Persistent storage contains a training dataset and a test dataset, each with units of text labelled from a plurality of categories. A machine learning model has been trained with the training dataset to classify input units of text into the plurality of categories. One or more processors are configured to: read the training dataset or the test dataset; determine distributional properties of the training dataset or the test dataset; determine, using the machine learning model, saliency maps for tokens in the training dataset or the test dataset; perturb, by way of token insertion, token deletion, or token replacement, the training dataset or the test dataset into an expanded dataset; obtain, using the machine learning model, classifications into the plurality of categories for the expanded dataset; and based on the distributional properties, the saliency maps, and the classifications, identify causes of failure for the machine learning model. |
---|