DATA QUALITY ANALYSIS

A method includes receiving information indicative of an output dataset generated by a data processing system; identifying, based on data lineage information relating to the output dataset, one or more upstream datasets on which the output dataset depends; analyzing one or more of the identified one...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Gould, Joel, Spitz, Chuck
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Gould, Joel
Spitz, Chuck
description A method includes receiving information indicative of an output dataset generated by a data processing system; identifying, based on data lineage information relating to the output dataset, one or more upstream datasets on which the output dataset depends; analyzing one or more of the identified one or more upstream datasets on which the output dataset depends. The analyzing includes, for each particular upstream dataset of the one or more upstream datasets, applying one or more of: (i) a first rule indicative of an allowable deviation between a profile of the particular upstream dataset and a reference profile for the particular upstream dataset, and (ii) a second rule indicative of one or more allowable values or prohibited values for each of one or more data elements in the particular upstream dataset, and based on the results of applying the one or more rules, selecting one or more of the upstream datasets. The method includes outputting information associated with the selected one or more upstream datasets.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US2020057757A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US2020057757A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US2020057757A13</originalsourceid><addsrcrecordid>eNrjZBB1cQxxVAgMdfTxDIlUcPRz9IkM9gzmYWBNS8wpTuWF0twMym6uIc4euqkF-fGpxQWJyal5qSXxocFGBkYGBqbm5qbmjobGxKkCABR8H90</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>DATA QUALITY ANALYSIS</title><source>esp@cenet</source><creator>Gould, Joel ; Spitz, Chuck</creator><creatorcontrib>Gould, Joel ; Spitz, Chuck</creatorcontrib><description>A method includes receiving information indicative of an output dataset generated by a data processing system; identifying, based on data lineage information relating to the output dataset, one or more upstream datasets on which the output dataset depends; analyzing one or more of the identified one or more upstream datasets on which the output dataset depends. The analyzing includes, for each particular upstream dataset of the one or more upstream datasets, applying one or more of: (i) a first rule indicative of an allowable deviation between a profile of the particular upstream dataset and a reference profile for the particular upstream dataset, and (ii) a second rule indicative of one or more allowable values or prohibited values for each of one or more data elements in the particular upstream dataset, and based on the results of applying the one or more rules, selecting one or more of the upstream datasets. The method includes outputting information associated with the selected one or more upstream datasets.</description><language>eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2020</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20200220&amp;DB=EPODOC&amp;CC=US&amp;NR=2020057757A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,777,882,25545,76296</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20200220&amp;DB=EPODOC&amp;CC=US&amp;NR=2020057757A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Gould, Joel</creatorcontrib><creatorcontrib>Spitz, Chuck</creatorcontrib><title>DATA QUALITY ANALYSIS</title><description>A method includes receiving information indicative of an output dataset generated by a data processing system; identifying, based on data lineage information relating to the output dataset, one or more upstream datasets on which the output dataset depends; analyzing one or more of the identified one or more upstream datasets on which the output dataset depends. The analyzing includes, for each particular upstream dataset of the one or more upstream datasets, applying one or more of: (i) a first rule indicative of an allowable deviation between a profile of the particular upstream dataset and a reference profile for the particular upstream dataset, and (ii) a second rule indicative of one or more allowable values or prohibited values for each of one or more data elements in the particular upstream dataset, and based on the results of applying the one or more rules, selecting one or more of the upstream datasets. The method includes outputting information associated with the selected one or more upstream datasets.</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2020</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZBB1cQxxVAgMdfTxDIlUcPRz9IkM9gzmYWBNS8wpTuWF0twMym6uIc4euqkF-fGpxQWJyal5qSXxocFGBkYGBqbm5qbmjobGxKkCABR8H90</recordid><startdate>20200220</startdate><enddate>20200220</enddate><creator>Gould, Joel</creator><creator>Spitz, Chuck</creator><scope>EVB</scope></search><sort><creationdate>20200220</creationdate><title>DATA QUALITY ANALYSIS</title><author>Gould, Joel ; Spitz, Chuck</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US2020057757A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2020</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>Gould, Joel</creatorcontrib><creatorcontrib>Spitz, Chuck</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Gould, Joel</au><au>Spitz, Chuck</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>DATA QUALITY ANALYSIS</title><date>2020-02-20</date><risdate>2020</risdate><abstract>A method includes receiving information indicative of an output dataset generated by a data processing system; identifying, based on data lineage information relating to the output dataset, one or more upstream datasets on which the output dataset depends; analyzing one or more of the identified one or more upstream datasets on which the output dataset depends. The analyzing includes, for each particular upstream dataset of the one or more upstream datasets, applying one or more of: (i) a first rule indicative of an allowable deviation between a profile of the particular upstream dataset and a reference profile for the particular upstream dataset, and (ii) a second rule indicative of one or more allowable values or prohibited values for each of one or more data elements in the particular upstream dataset, and based on the results of applying the one or more rules, selecting one or more of the upstream datasets. The method includes outputting information associated with the selected one or more upstream datasets.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language eng
recordid cdi_epo_espacenet_US2020057757A1
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
title DATA QUALITY ANALYSIS
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T14%3A10%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Gould,%20Joel&rft.date=2020-02-20&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2020057757A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true