DATA GAP MITIGATION
Disclosed embodiments provide techniques for estimating imputation algorithm performance. Multiple imputer algorithms are selected, and an evaluation of how well each of the imputer algorithms can estimate the missing data is performed. Disclosed embodiments obtain an imputer candidate dataset (ICD)...
field | value |
---|---|
creator | Yashchin, Emmanuel ; Iyengar, Arun Kwangil ; Patel, Dhavalkumar C ; Bhamidipaty, Anuradha ; Zhou, Nianjun ; Shrivastava, Shrey |
description | Disclosed embodiments provide techniques for estimating imputation algorithm performance. Multiple imputer algorithms are selected, and each is evaluated on how well it can estimate the missing data. Disclosed embodiments obtain an imputer candidate dataset (ICD). The ICD is compared to the incomplete data range, and a similarity metric is determined between the incomplete data range and the ICD. When the similarity metric exceeds a predetermined threshold, an imputer evaluation dataset (IED) is created from the ICD by removing one or more data points from the ICD. Each imputer algorithm is evaluated by applying it to the IED and computing an imputer evaluation metric based on its performance. The imputer algorithms are ranked based on the imputer evaluation metric. The best-ranked imputer algorithm can then be selected for use on the incomplete data range within the measurement dataset. |
format | Patent |
creationdate | 2024 |
date | 2024-05-09 |
oa | free_for_read |
linktohtml | https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240509&DB=EPODOC&CC=US&NR=2024152492A1 |
fulltext | fulltext_linktorsrc |
language | eng |
recordid | cdi_epo_espacenet_US2024152492A1 |
source | esp@cenet |
subjects | CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS |
title | DATA GAP MITIGATION |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T02%3A01%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Yashchin,%20Emmanuel&rft.date=2024-05-09&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2024152492A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
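
The procedure summarized in the description field (compare an ICD with the incomplete data range, build an IED by removing points, score each imputer, rank) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `similarity` function, the 0.5 threshold, the three toy imputers, the use of RMSE as the imputer evaluation metric, and all function names are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def similarity(a, b):
    """Hypothetical similarity metric in (0, 1]; 1.0 when the observed
    means and spreads match, approaching 0 as they diverge."""
    a = [x for x in a if x is not None]
    b = [x for x in b if x is not None]
    scale = max(pstdev(a + b), 1e-12)
    gap = abs(mean(a) - mean(b)) + abs(pstdev(a) - pstdev(b))
    return 1.0 / (1.0 + gap / scale)


def make_ied(icd, n_remove, rng):
    """Build the IED by masking n_remove interior points of the ICD."""
    idx = sorted(rng.sample(range(1, len(icd) - 1), n_remove))
    ied = list(icd)
    for i in idx:
        ied[i] = None
    return ied, idx


def impute_mean(series):
    m = mean(x for x in series if x is not None)
    return [m if x is None else x for x in series]


def impute_ffill(series):
    # Assumes the first value is observed; a leading gap would stay None.
    out, last = [], None
    for x in series:
        last = last if x is None else x
        out.append(last)
    return out


def impute_linear(series):
    known = [i for i, x in enumerate(series) if x is not None]
    out = list(series)
    for i, x in enumerate(series):
        if x is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            t = (i - left) / (right - left)
            out[i] = series[left] + t * (series[right] - series[left])
    return out


def rmse(truth, filled, idx):
    """Assumed evaluation metric: error on the removed points only."""
    return math.sqrt(mean((truth[i] - filled[i]) ** 2 for i in idx))


def rank_imputers(imputers, incomplete, icd, threshold=0.5, n_remove=5, seed=0):
    """Return (name, score) pairs sorted best first (lowest RMSE)."""
    if similarity(incomplete, icd) <= threshold:
        raise ValueError("ICD is not similar enough to the incomplete range")
    ied, idx = make_ied(icd, n_remove, random.Random(seed))
    scores = {name: rmse(icd, fn(ied), idx) for name, fn in imputers.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])


imputers = {"mean": impute_mean, "ffill": impute_ffill, "linear": impute_linear}
icd = [float(i) for i in range(20)]            # complete candidate dataset
incomplete = [i + 0.5 for i in range(20)]      # measurement range with gaps
for gap in (2, 4, 5, 11):
    incomplete[gap] = None

ranking = rank_imputers(imputers, incomplete, icd)
filled = imputers[ranking[0][0]](incomplete)   # apply the best-ranked imputer
print(ranking)
print(filled)
```

Because the points removed from the ICD are known, each imputer's estimates can be scored against ground truth, which is not possible on the real gaps. The ranking is only as trustworthy as the similarity check that admits the ICD, so the threshold deserves tuning for the data at hand.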