Using a data mining algorithm to generate format rules used to validate data sets

Provided are a method, system, and article of manufacture for using a data mining algorithm to generate format rules used to validate data sets. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one format column for which for...

Detailed description

Bibliographic details
Main authors: MEEKS DAVID THOMAS, SAILLET YANNICK, ROTH MARY ANN, LABRIE JACQUES JOSEPH
Format: Patent
Language: eng
creator MEEKS DAVID THOMAS
SAILLET YANNICK
ROTH MARY ANN
LABRIE JACQUES JOSEPH
description Provided are a method, system, and article of manufacture for using a data mining algorithm to generate format rules used to validate data sets. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one format column for which format rules are to be generated and selection is received of at least one predictor column. A format mask column is generated for each selected format column. For records in the data set, a value in the at least one format column is converted to a format mask representing a format of the value in the format column and storing the format mask in the format mask column in the record for which the format mask was generated. The at least one predictor column and the at least one format mask column are processed to generate at least one format rule. Each format rule specifies a format mask associated with at least one condition in the at least one predictor column.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US8166000B2</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US8166000B2</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US8166000B23</originalsourceid><addsrcrecordid>eNrjZAgMLc7MS1dIVEhJLElUyM3MA_Ny0vOLMksychVK8hXSU_NSixJLUhXS8otyE0sUikpzUosVSotTU0CyZYk5mSkgWbD-4tSSYh4G1rTEnOJUXijNzaDg5hri7KGbWpAfn1pckJgMNLAkPjTYwtDMzMDAwMnImAglADDxNvc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Using a data mining algorithm to generate format rules used to validate data sets</title><source>esp@cenet</source><creator>MEEKS DAVID THOMAS ; SAILLET YANNICK ; ROTH MARY ANN ; LABRIE JACQUES JOSEPH</creator><creatorcontrib>MEEKS DAVID THOMAS ; SAILLET YANNICK ; ROTH MARY ANN ; LABRIE JACQUES JOSEPH</creatorcontrib><description>Provided are a method, system, and article of manufacture for using a data mining algorithm to generate format rules used to validate data sets. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one format column for which format rules are to be generated and selection is received of at least one predictor column. A format mask column is generated for each selected format column. For records in the data set, a value in the at least one format column is converted to a format mask representing a format of the value in the format column and storing the format mask in the format mask column in the record for which the format mask was generated. The at least one predictor column and the at least one format mask column are processed to generate at least one format rule. 
Each format rule specifies a format mask associated with at least one condition in the at least one predictor column.</description><language>eng</language><subject>CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2012</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20120424&amp;DB=EPODOC&amp;CC=US&amp;NR=8166000B2$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,309,781,886,25566,76549</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20120424&amp;DB=EPODOC&amp;CC=US&amp;NR=8166000B2$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>MEEKS DAVID THOMAS</creatorcontrib><creatorcontrib>SAILLET YANNICK</creatorcontrib><creatorcontrib>ROTH MARY ANN</creatorcontrib><creatorcontrib>LABRIE JACQUES JOSEPH</creatorcontrib><title>Using a data mining algorithm to generate format rules used to validate data sets</title><description>Provided are a method, system, and article of manufacture for using a data mining algorithm to generate format rules used to validate data sets. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one format column for which format rules are to be generated and selection is received of at least one predictor column. A format mask column is generated for each selected format column. 
For records in the data set, a value in the at least one format column is converted to a format mask representing a format of the value in the format column and storing the format mask in the format mask column in the record for which the format mask was generated. The at least one predictor column and the at least one format mask column are processed to generate at least one format rule. Each format rule specifies a format mask associated with at least one condition in the at least one predictor column.</description><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2012</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZAgMLc7MS1dIVEhJLElUyM3MA_Ny0vOLMksychVK8hXSU_NSixJLUhXS8otyE0sUikpzUosVSotTU0CyZYk5mSkgWbD-4tSSYh4G1rTEnOJUXijNzaDg5hri7KGbWpAfn1pckJgMNLAkPjTYwtDMzMDAwMnImAglADDxNvc</recordid><startdate>20120424</startdate><enddate>20120424</enddate><creator>MEEKS DAVID THOMAS</creator><creator>SAILLET YANNICK</creator><creator>ROTH MARY ANN</creator><creator>LABRIE JACQUES JOSEPH</creator><scope>EVB</scope></search><sort><creationdate>20120424</creationdate><title>Using a data mining algorithm to generate format rules used to validate data sets</title><author>MEEKS DAVID THOMAS ; SAILLET YANNICK ; ROTH MARY ANN ; LABRIE JACQUES JOSEPH</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US8166000B23</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2012</creationdate><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA 
PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>MEEKS DAVID THOMAS</creatorcontrib><creatorcontrib>SAILLET YANNICK</creatorcontrib><creatorcontrib>ROTH MARY ANN</creatorcontrib><creatorcontrib>LABRIE JACQUES JOSEPH</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>MEEKS DAVID THOMAS</au><au>SAILLET YANNICK</au><au>ROTH MARY ANN</au><au>LABRIE JACQUES JOSEPH</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Using a data mining algorithm to generate format rules used to validate data sets</title><date>2012-04-24</date><risdate>2012</risdate><abstract>Provided are a method, system, and article of manufacture for using a data mining algorithm to generate format rules used to validate data sets. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one format column for which format rules are to be generated and selection is received of at least one predictor column. A format mask column is generated for each selected format column. For records in the data set, a value in the at least one format column is converted to a format mask representing a format of the value in the format column and storing the format mask in the format mask column in the record for which the format mask was generated. The at least one predictor column and the at least one format mask column are processed to generate at least one format rule. Each format rule specifies a format mask associated with at least one condition in the at least one predictor column.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
language eng
recordid cdi_epo_espacenet_US8166000B2
source esp@cenet
subjects CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
title Using a data mining algorithm to generate format rules used to validate data sets
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T19%3A41%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=MEEKS%20DAVID%20THOMAS&rft.date=2012-04-24&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS8166000B2%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
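
The steps named in the description field (generate a format mask column, then derive rules that tie a mask to a condition on a predictor column) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration, not taken from the patent: the mask alphabet (`9` for digits, `A`/`a` for upper/lower-case letters, other characters kept literally), the helper names, the sample records, and the rule miner itself, which is a simple frequency count with a confidence threshold standing in for whatever data mining algorithm the patent actually uses.

```python
from collections import Counter, defaultdict


def to_format_mask(value: str) -> str:
    """Convert a value to a format mask (assumed alphabet: 9, A, a)."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)  # punctuation and spaces are kept literally
    return "".join(out)


def add_mask_column(records, format_col):
    """Store each record's mask in a new '<format_col>_mask' field."""
    mask_col = format_col + "_mask"
    for rec in records:
        rec[mask_col] = to_format_mask(rec[format_col])
    return mask_col


def mine_format_rules(records, predictor_col, mask_col, min_confidence=0.6):
    """Map each predictor value to its dominant mask, if frequent enough.

    Stand-in for a real association-rule miner: one condition per rule,
    confidence = share of records with that predictor value and mask.
    """
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[predictor_col]][rec[mask_col]] += 1
    rules = {}
    for condition, masks in counts.items():
        mask, hits = masks.most_common(1)[0]
        if hits / sum(masks.values()) >= min_confidence:
            rules[condition] = mask
    return rules


def violations(records, rules, predictor_col, mask_col):
    """Return records whose mask contradicts the rule for their condition."""
    return [
        rec for rec in records
        if rec[predictor_col] in rules
        and rec[mask_col] != rules[rec[predictor_col]]
    ]


# Hypothetical sample data: 'phone' is the format column, 'country' the predictor.
records = [
    {"country": "US", "phone": "555-123-4567"},
    {"country": "US", "phone": "555-987-6543"},
    {"country": "US", "phone": "5559876543"},
    {"country": "FR", "phone": "01 23 45 67 89"},
    {"country": "FR", "phone": "04 56 78 90 12"},
]
mask_col = add_mask_column(records, "phone")
rules = mine_format_rules(records, "country", mask_col)
bad = violations(records, rules, "country", mask_col)
```

In this sketch each rule has the form "if country = X then the phone mask is M"; the third US record, written without dashes, is the only one flagged. A production rule miner would typically support multi-column conditions and report support and confidence alongside each rule.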