Labeling of data for machine learning
A computer generates labels for machine learning algorithms by retrieving, from a data storage circuit, multiple label sets that contain labels that each classify data points in a corpus of data. A graph is generated that includes a plurality of edges, each edge between two respective labels from di...
Gespeichert in:
Hauptverfasser: | , , , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | A computer generates labels for machine learning algorithms by retrieving, from a data storage circuit, multiple label sets that contain labels that each classify data points in a corpus of data. A graph is generated that includes a plurality of edges, each edge between two respective labels from different label sets of the multiple label sets. Weights are determined for the plurality of edges based upon a consistency between data points classified by two labels connected by the edges. An algorithm is applied that groups labels from the multiple label sets based upon the weights for the plurality of edges. Data points are identified from the corpus of data that represent conflicts within the grouped labels. An electronic message is transmitted in order to present the identified data points to entities for further classification. A new label set is generated using the further classification received from the entities. |
---|