Normalizing text attributes for machine learning models
Format: | Patent |
---|---|
Language: | eng |
Summary: | Respective correlation metrics between token groups of a particular text attribute of a data set and a prediction target attribute are computed. Based on the correlation metrics, a predictive token group list is created. For various observation records of the data set, values of a derived categorical attribute corresponding to the particular text attribute are determined based on matches between the particular text attribute value and the predictive token group list. A measure of the predictive utility of the particular text attribute is obtained using correlations between the categorical attribute and the prediction target attribute. |
---|
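
The four steps in the summary can be illustrated with a minimal Python sketch. It is not the patented method; several choices are assumptions made only for illustration: token groups are single lowercase words, the correlation metric is the phi coefficient against a binary target, the derived category is the highest-ranked listed token found in the text (or a hypothetical "OTHER" bucket), and predictive utility is measured with Cramér's V. All function names and the toy records are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: a "token group" is a single lowercase word.
    return text.lower().split()

def phi(xs, ys):
    # Phi coefficient (Pearson correlation) of two equal-length 0/1 sequences.
    n = len(xs)
    n11 = sum(1 for x, y in zip(xs, ys) if x and y)
    x1, y1 = sum(xs), sum(ys)
    denom = math.sqrt(x1 * (n - x1) * y1 * (n - y1))
    return 0.0 if denom == 0 else (n * n11 - x1 * y1) / denom

def predictive_tokens(records, top_k=2):
    # Steps 1 and 2: score each token against the target, keep the top_k.
    labels = [label for _, label in records]
    token_sets = [set(tokenize(text)) for text, _ in records]
    vocab = sorted(set().union(*token_sets))
    scored = []
    for tok in vocab:
        present = [1 if tok in s else 0 for s in token_sets]
        scored.append((abs(phi(present, labels)), tok))
    # Highest absolute correlation first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tok for _, tok in scored[:top_k]]

def derive_category(text, token_list):
    # Step 3: the first listed token found in the text becomes the category.
    toks = set(tokenize(text))
    for tok in token_list:
        if tok in toks:
            return tok
    return "OTHER"  # hypothetical bucket for records with no match

def cramers_v(categories, labels):
    # Step 4: association between the derived category and the target.
    n = len(labels)
    joint = Counter(zip(categories, labels))
    cat_totals, lab_totals = Counter(categories), Counter(labels)
    chi2 = 0.0
    for c, nc in cat_totals.items():
        for l, nl in lab_totals.items():
            expected = nc * nl / n
            chi2 += (joint[(c, l)] - expected) ** 2 / expected
    k = min(len(cat_totals), len(lab_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data (fabricated for illustration only): (text attribute, binary target).
records = [
    ("great product fast shipping", 1),
    ("great value", 1),
    ("terrible quality", 0),
    ("terrible support slow shipping", 0),
    ("great support", 1),
    ("slow and terrible", 0),
]

token_list = predictive_tokens(records, top_k=2)
categories = [derive_category(text, token_list) for text, _ in records]
utility = cramers_v(categories, [label for _, label in records])
```

On this toy data the derived attribute separates the two target classes perfectly, so the utility score is at its maximum; on real data a low score would suggest that the text attribute contributes little and could be dropped or handled differently before training.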