SYNTHETIC DATA GENERATION FOR MACHINE LEARNING MODELS

A method may include generating synthetic data based on input data and training a machine learning model based on the synthetic data. The synthetic data may be generated by determining a plurality of data points representing an archetype probability distribution of a plurality of archetypes, cluster...

Detailed description

Bibliographic details
Main authors: Rahman, Shafi Ur; Zoldi, Scott Michael
Format: Patent
Language: eng
creator Rahman, Shafi Ur
Zoldi, Scott Michael
description A method may include generating synthetic data based on input data and training a machine learning model based on the synthetic data. The synthetic data may be generated by determining a plurality of data points representing an archetype probability distribution of a plurality of archetypes, clustering the plurality of data points into one or more clusters associated with transactional behavior patterns, generating a threshold metric representing a peak distribution density of the plurality of data points associated with a corresponding cluster, removing, from the plurality of data points, one or more non-representative data points to define a reduced set of the plurality of data points, generating an updated archetype probability distribution based at least on the reduced set of the plurality of data points, and generating representative transaction data based on the updated archetype probability distribution and threshold metric. Related methods and articles of manufacture are also disclosed.
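The steps listed in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how clusters are formed, how the threshold metric is computed, or how transactions are produced. The sketch assumes plain k-means for the clustering, a distance quantile around each centroid as a stand-in for the "peak distribution density" threshold, and Dirichlet sampling around the updated distribution. The names `kmeans`, `synthesize`, `keep_quantile`, and `concentration` are hypothetical. The output is a set of synthetic archetype mixtures; turning those into actual transaction records is left out.

```python
import numpy as np


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means on the rows of `points`; returns (labels, centroids)."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    def assign(cents):
        dists = np.linalg.norm(points[:, None, :] - cents[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(iters):
        labels = assign(centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # an emptied cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return assign(centroids), centroids


def synthesize(points, k=2, keep_quantile=0.8, n_per_cluster=100,
               concentration=50.0, seed=0):
    """Cluster archetype distributions, trim outliers, resample per cluster.

    Each row of `points` is assumed to be one entity's probability
    distribution over archetypes (non-negative, summing to 1).
    """
    rng = np.random.default_rng(seed)
    labels, centroids = kmeans(points, k, rng)
    clusters = []
    for j in range(k):
        members = points[labels == j]
        if len(members) == 0:
            continue
        # ASSUMPTION: the threshold metric is a distance quantile around the
        # centroid, used as a crude proxy for the region of peak density.
        dist = np.linalg.norm(members - centroids[j], axis=1)
        threshold = np.quantile(dist, keep_quantile)
        # Remove non-representative points (those beyond the threshold).
        reduced = members[dist <= threshold]
        # Updated archetype distribution: renormalized mean of the reduced set.
        updated = reduced.mean(axis=0)
        updated = updated / updated.sum()
        # ASSUMPTION: sample candidates from a Dirichlet centred on `updated`
        # and keep only those within the threshold distance of it.
        candidates = rng.dirichlet(updated * concentration + 1e-6,
                                   size=4 * n_per_cluster)
        near = np.linalg.norm(candidates - updated, axis=1) <= threshold
        clusters.append({
            "n_members": len(members),
            "reduced": reduced,
            "threshold": threshold,
            "updated": updated,
            "synthetic": candidates[near][:n_per_cluster],
        })
    return clusters


if __name__ == "__main__":
    demo_rng = np.random.default_rng(1)
    # Toy input: two behaviour patterns over three archetypes.
    data = np.vstack([demo_rng.dirichlet([8, 1, 1], 60),
                      demo_rng.dirichlet([1, 1, 8], 60)])
    for cluster in synthesize(data):
        print(cluster["updated"].round(3), len(cluster["synthetic"]))
```

The quantile and concentration values are arbitrary here. A tighter quantile discards more of each cluster before the distribution is updated, and a cluster may yield fewer than `n_per_cluster` synthetic rows when few candidates fall inside the threshold.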
format Patent
sourcerecordid US2024112045A1
date 2024-04-04
oa free_for_read
linktohtml https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240404&DB=EPODOC&CC=US&NR=2024112045A1
fulltext fulltext_linktorsrc
language eng
recordid cdi_epo_espacenet_US2024112045A1
source esp@cenet
subjects CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
PHYSICS
title SYNTHETIC DATA GENERATION FOR MACHINE LEARNING MODELS
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T16%3A09%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Rahman,%20Shafi%20Ur&rft.date=2024-04-04&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2024112045A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true