SYNTHETIC DATA GENERATION FOR MACHINE LEARNING MODELS
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | A method may include generating synthetic data based on input data and training a machine learning model based on the synthetic data. The synthetic data may be generated by determining a plurality of data points representing an archetype probability distribution of a plurality of archetypes, clustering the plurality of data points into one or more clusters associated with transactional behavior patterns, generating a threshold metric representing a peak distribution density of the plurality of data points associated with a corresponding cluster, removing, from the plurality of data points, one or more non-representative data points to define a reduced set of the plurality of data points, generating an updated archetype probability distribution based at least on the reduced set of the plurality of data points, and generating representative transaction data based on the updated archetype probability distribution and threshold metric. Related methods and articles of manufacture are also disclosed. |
---|
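
The steps listed in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation, and every concrete choice in it is an assumption made for illustration: grouping by dominant archetype stands in for the clustering step, a neighbor count within a fixed radius stands in for the distribution density, the threshold is a fixed fraction (`keep_fraction`) of each cluster's peak density, and the per-archetype mean amounts (`amount_means`) and the exponential amount model are hypothetical. All function names are invented for this example.

```python
import math
import random
from collections import defaultdict


def cluster_by_dominant_archetype(points):
    """Group probability vectors by their highest-probability archetype.

    Assumption: this simple grouping stands in for a real clustering step.
    """
    clusters = defaultdict(list)
    for p in points:
        clusters[max(range(len(p)), key=p.__getitem__)].append(p)
    return dict(clusters)


def local_density(point, cluster, radius):
    """Count cluster members within `radius` (Euclidean) of `point`."""
    return sum(1 for q in cluster if math.dist(point, q) <= radius)


def prune_cluster(cluster, radius=0.15, keep_fraction=0.6):
    """Return (threshold, kept points) for one cluster.

    The threshold is a fraction of the cluster's peak density (assumed rule);
    points whose density falls below it are treated as non-representative.
    """
    densities = [local_density(p, cluster, radius) for p in cluster]
    threshold = keep_fraction * max(densities)
    kept = [p for p, d in zip(cluster, densities) if d >= threshold]
    return threshold, kept


def updated_distribution(points):
    """Average the retained vectors and renormalize so they sum to 1."""
    k = len(points[0])
    means = [sum(p[i] for p in points) / len(points) for i in range(k)]
    total = sum(means)
    return [m / total for m in means]


def generate_transactions(distribution, amount_means, n, seed=0):
    """Sample (archetype, amount) pairs; the amount model is hypothetical."""
    rng = random.Random(seed)
    archetypes = rng.choices(range(len(distribution)), weights=distribution, k=n)
    return [(a, round(rng.expovariate(1.0 / amount_means[a]), 2)) for a in archetypes]


# Toy input: each row is one data point's probability over three archetypes.
points = [
    [0.80, 0.10, 0.10], [0.78, 0.12, 0.10], [0.82, 0.09, 0.09],
    [0.40, 0.30, 0.30],  # sparse point in the archetype-0 cluster
    [0.10, 0.80, 0.10], [0.12, 0.79, 0.09],
    [0.30, 0.40, 0.30],  # sparse point in the archetype-1 cluster
]

reduced = []
for label, members in cluster_by_dominant_archetype(points).items():
    threshold, kept = prune_cluster(members)
    reduced.extend(kept)

dist = updated_distribution(reduced)
txns = generate_transactions(dist, amount_means=[20.0, 150.0, 900.0], n=10)
```

On this toy input the two sparse points are dropped, and `dist` is recomputed from the five remaining points before any transactions are sampled. The resulting `txns` list could then serve as synthetic training rows for a downstream model.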