AUTOMATED CLASSIFICATION OF DATASETS USING SEMANTIC TYPE IDENTIFICATION
Format: | Patent |
---|---|
Language: | eng |
Abstract: | A method for automatically classifying datasets is implemented on a computing system. The computing system receives a dataset from a source, wherein the dataset includes a plurality of data entries. The method includes the steps of: providing a plurality of predetermined semantic types; processing the data entries to identify each of the data entries as one of the semantic types, the processing including examining the data entries using two different models; generating a confidence score for each of the models based upon the examination of the data entries; generating a confidence label based upon a predetermined combination of the confidence scores; and generating a classification recommendation for the dataset based upon the identified semantic types and associating the confidence label with the dataset. |
---|
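
The steps named in the abstract can be illustrated with a minimal Python sketch. Nothing below is taken from the patent itself: the semantic types (`email`, `phone`), the two models (a regex matcher and a character heuristic), the 0.8 threshold, the label names, and the mapping from semantic type to recommendation are all assumptions chosen only to make the flow concrete.

```python
import re
from collections import Counter

# Hypothetical predetermined semantic types, each with an illustrative regex.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,}$"),
}

# Hypothetical mapping from a semantic type to a dataset classification.
RECOMMENDATIONS = {"email": "personal data", "phone": "personal data"}


def _score(types):
    """Confidence: share of entries assigned the most common known type."""
    known = [t for t in types if t != "unknown"]
    if not known:
        return 0.0
    return Counter(known).most_common(1)[0][1] / len(types)


def pattern_model(entries):
    """Assumed model 1: match each whole entry against the regexes."""
    types = [
        next((t for t, p in PATTERNS.items() if p.match(e)), "unknown")
        for e in entries
    ]
    return types, _score(types)


def character_model(entries):
    """Assumed model 2: crude character-level heuristics."""
    types = []
    for e in entries:
        digits = sum(c.isdigit() for c in e)
        if "@" in e:
            types.append("email")
        elif e and digits / len(e) >= 0.6:
            types.append("phone")
        else:
            types.append("unknown")
    return types, _score(types)


def confidence_label(score_a, score_b, threshold=0.8):
    """Assumed predetermined combination of the two confidence scores."""
    if score_a >= threshold and score_b >= threshold:
        return "high"
    if score_a >= threshold or score_b >= threshold:
        return "medium"
    return "low"


def classify_dataset(entries):
    """Identify a type per entry, then recommend a dataset classification."""
    types_a, score_a = pattern_model(entries)
    types_b, score_b = character_model(entries)
    # Prefer model 1's answer per entry; fall back to model 2 when it is unknown.
    merged = [a if a != "unknown" else b for a, b in zip(types_a, types_b)]
    known = [t for t in merged if t != "unknown"]
    dominant = Counter(known).most_common(1)[0][0] if known else "unknown"
    return {
        "entry_types": merged,
        "scores": (score_a, score_b),
        "confidence_label": confidence_label(score_a, score_b),
        "recommendation": RECOMMENDATIONS.get(dominant, "unclassified"),
    }


result = classify_dataset(["alice@example.com", "bob@example.org"])
print(result["recommendation"], result["confidence_label"])
```

Requiring both scores to clear the threshold before issuing the strongest label is one plausible reading of a "predetermined combination"; a weighted average or a lookup table over score ranges would fit the abstract's wording equally well.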