Machine-learnt field-specific standardization

A system tokenizes raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences. A machine-learning model learns standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding st...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Georgiev, Stanislav, Jagota, Arun Kumar
Format:	Patent
Sprache:	eng
Schlagworte:	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Georgiev, Stanislav Jagota, Arun Kumar
description	A system tokenizes raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences. A machine-learning model learns standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences. The system tokenizes an input value into an input token sequence. The machine-learning model determines a probability of inserting an insertion token after an insertion markable token in the input token sequence. If the probability of inserting the insertion token satisfies a threshold, the system inserts the insertion token after the insertion markable token in the input token sequence. The machine-learning model determines a probability of substituting a substitution token for a substitutable token in the input token sequence. If the probability of substituting the substitution token satisfies another threshold, the system substitutes the substitution token for the substitutable token in the input token sequence.
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US11886461B2</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US11886461B2</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US11886461B23</originalsourceid><addsrcrecordid>eNrjZND1TUzOyMxL1c1JTSzKK1FIy0zNSdEtLkhNzkzLTFYoLknMS0ksSsmsSizJzM_jYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxocGGhhYWZiZmhk5GxsSoAQDsZyq8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Machine-learnt field-specific standardization</title><source>esp@cenet</source><creator>Georgiev, Stanislav ; Jagota, Arun Kumar</creator><creatorcontrib>Georgiev, Stanislav ; Jagota, Arun Kumar</creatorcontrib><description>A system tokenizes raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences. A machine-learning model learns standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences. The system tokenizes an input value into an input token sequence. The machine-learning model determines a probability of inserting an insertion token after an insertion markable token in the input token sequence. If the probability of inserting the insertion token satisfies a threshold, the system inserts the insertion token after the insertion markable token in the input token sequence. The machine-learning model determines a probability of substituting a substitution token for a substitutable token in the input token sequence. If the probability of substituting the substitution token satisfies another threshold, the system substitutes the substitution token for the substitutable token in the input token sequence.</description><language>eng</language><subject>CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240130&DB=EPODOC&CC=US&NR=11886461B2$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,777,882,25545,76296</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240130&DB=EPODOC&CC=US&NR=11886461B2$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Georgiev, Stanislav</creatorcontrib><creatorcontrib>Jagota, Arun Kumar</creatorcontrib><title>Machine-learnt field-specific standardization</title><description>A system tokenizes raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences. A machine-learning model learns standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences. The system tokenizes an input value into an input token sequence. The machine-learning model determines a probability of inserting an insertion token after an insertion markable token in the input token sequence. If the probability of inserting the insertion token satisfies a threshold, the system inserts the insertion token after the insertion markable token in the input token sequence. The machine-learning model determines a probability of substituting a substitution token for a substitutable token in the input token sequence. If the probability of substituting the substitution token satisfies another threshold, the system substitutes the substitution token for the substitutable token in the input token sequence.</description><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2024</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZND1TUzOyMxL1c1JTSzKK1FIy0zNSdEtLkhNzkzLTFYoLknMS0ksSsmsSizJzM_jYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxocGGhhYWZiZmhk5GxsSoAQDsZyq8</recordid><startdate>20240130</startdate><enddate>20240130</enddate><creator>Georgiev, Stanislav</creator><creator>Jagota, Arun Kumar</creator><scope>EVB</scope></search><sort><creationdate>20240130</creationdate><title>Machine-learnt field-specific standardization</title><author>Georgiev, Stanislav ; Jagota, Arun Kumar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US11886461B23</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2024</creationdate><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>Georgiev, Stanislav</creatorcontrib><creatorcontrib>Jagota, Arun Kumar</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Georgiev, Stanislav</au><au>Jagota, Arun Kumar</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Machine-learnt field-specific standardization</title><date>2024-01-30</date><risdate>2024</risdate><abstract>A system tokenizes raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences. A machine-learning model learns standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences. The system tokenizes an input value into an input token sequence. The machine-learning model determines a probability of inserting an insertion token after an insertion markable token in the input token sequence. If the probability of inserting the insertion token satisfies a threshold, the system inserts the insertion token after the insertion markable token in the input token sequence. The machine-learning model determines a probability of substituting a substitution token for a substitutable token in the input token sequence. If the probability of substituting the substitution token satisfies another threshold, the system substitutes the substitution token for the substitutable token in the input token sequence.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier
ispartof
issn
language	eng
recordid	cdi_epo_espacenet_US11886461B2
source	esp@cenet
subjects	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS
title	Machine-learnt field-specific standardization
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A53%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Georgiev,%20Stanislav&rft.date=2024-01-30&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS11886461B2%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true