DATA GAP MITIGATION

Disclosed embodiments provide techniques for estimating imputation algorithm performance. Multiple imputer algorithms are selected, and an evaluation of how well each of the imputer algorithms can estimate the missing data is performed. Disclosed embodiments obtain an imputer candidate dataset (ICD)...

Detailed description

Bibliographic details
Main authors:	Yashchin, Emmanuel; Iyengar, Arun Kwangil; Patel, Dhavalkumar C; Bhamidipaty, Anuradha; Zhou, Nianjun; Shrivastava, Shrey
Format:	Patent
Language:	eng
Subjects:	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS
Online access:	Order full text

creator	Yashchin, Emmanuel; Iyengar, Arun Kwangil; Patel, Dhavalkumar C; Bhamidipaty, Anuradha; Zhou, Nianjun; Shrivastava, Shrey
description	Disclosed embodiments provide techniques for estimating imputation algorithm performance. Multiple imputer algorithms are selected, and an evaluation of how well each of the imputer algorithms can estimate the missing data is performed. Disclosed embodiments obtain an imputer candidate dataset (ICD). The imputer candidate dataset is compared to the incomplete data range, and a similarity metric is determined between the data range and the ICD. When the similarity metric exceeds a predetermined threshold, an imputer evaluation dataset (IED) is created from the ICD by removing one or more data points from the ICD. Each imputer algorithm is evaluated by applying the IED to it, and computing an imputer evaluation metric based on its performance. The multiple imputer algorithms are ranked based on the imputer evaluation metric. The best ranked imputer algorithm can then be selected for use on the incomplete data range within the measurement dataset.
format	Patent
sourcerecordid	US2024152492A1
creationdate	2024
date	2024-05-09
oa	free_for_read
linktohtml	https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240509&DB=EPODOC&CC=US&NR=2024152492A1
fulltext	fulltext_linktorsrc
language	eng
recordid	cdi_epo_espacenet_US2024152492A1
source	esp@cenet
subjects	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS
title	DATA GAP MITIGATION
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T02%3A01%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Yashchin,%20Emmanuel&rft.date=2024-05-09&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2024152492A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
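
The evaluation procedure summarized in the description field can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the patent: the similarity function, the threshold value, the three toy imputers, the RMSE evaluation metric, and all names (`similarity`, `make_ied`, `select_imputer`, and so on) are hypothetical choices made here for demonstration, since the record does not specify them.

```python
import math
from statistics import mean, pstdev

def similarity(observed, candidate):
    """Hypothetical similarity in (0, 1]: compares mean and spread (an assumption)."""
    gap = abs(mean(observed) - mean(candidate)) + abs(pstdev(observed) - pstdev(candidate))
    return 1.0 / (1.0 + gap)

def make_ied(icd, removed):
    """Build the imputer evaluation dataset by blanking selected ICD points."""
    return [None if i in removed else v for i, v in enumerate(icd)]

def impute_mean(series):
    fill = mean(v for v in series if v is not None)
    return [fill if v is None else v for v in series]

def impute_ffill(series):
    # Seed with the first known value so a leading gap is still filled.
    last = next(v for v in series if v is not None)
    out = []
    for v in series:
        last = last if v is None else v
        out.append(last)
    return out

def impute_linear(series):
    known = [i for i, v in enumerate(series) if v is not None]
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out

def rmse(truth, estimate, removed):
    """Assumed evaluation metric: error on the removed points only (lower is better)."""
    return math.sqrt(mean((truth[i] - estimate[i]) ** 2 for i in removed))

def select_imputer(measurement, icd, removed, imputers, threshold):
    """Return imputers ranked best-first, or None if the ICD is too dissimilar."""
    observed = [v for v in measurement if v is not None]
    if similarity(observed, icd) <= threshold:
        return None
    ied = make_ied(icd, removed)
    scores = {name: rmse(icd, fn(ied), removed) for name, fn in imputers.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])

# Illustrative data only: a measurement series with a two-point gap and a complete candidate.
IMPUTERS = {"mean": impute_mean, "ffill": impute_ffill, "linear": impute_linear}
measurement = [1.0, 2.0, 3.0, None, None, 6.0, 7.0, 8.0]
icd = [1.1, 2.0, 3.2, 3.9, 5.1, 6.0, 7.1, 7.9]

ranking = select_imputer(measurement, icd, removed={3, 4}, imputers=IMPUTERS, threshold=0.5)
best_name = ranking[0][0]
filled = IMPUTERS[best_name](measurement)
```

Removing the same positions from the candidate that are missing in the measurement keeps the evaluation close to the real gap; other removal patterns are equally compatible with the description.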