Unsupervised discretization method for continuous attribute data based on information entropy

The invention relates to the technical field of discretization of continuous attributes of big data, in particular to an unsupervised discretization method for continuous attribute data based on information entropy. The method includes the steps as follows: the first step of traversing all value rec...
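
The five steps listed in the record's description field can be sketched in Python as follows. This is only an illustrative reading of the abstract, not the patented implementation. The function name `entropy_discretize` is hypothetical, the base-2 logarithm is an assumption (the abstract does not state the base), and so is the reading that k break points split the range [min, max] into k + 1 equal-width intervals. The fifth step is truncated in the source, so mapping each value to the index of its interval is likewise an assumption.

```python
import math
from bisect import bisect_right
from collections import Counter


def entropy_discretize(values):
    """Hypothetical sketch: entropy-guided equal-width discretization."""
    if not values:
        raise ValueError("values must not be empty")

    # Step 1: count each distinct value and record the minimum and maximum.
    n = len(values)
    counts = Counter(values)
    lo, hi = min(values), max(values)

    # Step 2: "value chaos degree" = Shannon entropy of the value distribution
    # (base 2 is an assumption; the abstract does not give the base).
    chaos = -sum((c / n) * math.log2(c / n) for c in counts.values())

    # Step 3: round the chaos degree down to get the number of break points.
    k = math.floor(chaos)

    # Step 4: equal-width intervals; k break points give k + 1 intervals
    # (assumed interpretation of the abstract).
    width = (hi - lo) / (k + 1)
    breaks = [lo + i * width for i in range(1, k + 1)]

    # Step 5 (assumed, the source text is truncated here): replace each value
    # with the index of the interval it falls into.
    labels = [bisect_right(breaks, v) for v in values]
    return breaks, labels


if __name__ == "__main__":
    breaks, labels = entropy_discretize([1, 2, 3, 4, 5, 6, 7, 8])
    print(breaks, labels)
```

With eight equally likely distinct values the entropy is 3 bits, so the sketch places three break points and produces four intervals. A constant attribute has zero entropy and therefore stays in a single interval.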

Bibliographic details
Main authors: CHEN WANGHU, GUO HONGLE, LI XINTIAN, MA SHENGJUN, QIAO BAOMIN
Format: Patent
Language: chi ; eng
creator CHEN WANGHU
GUO HONGLE
LI XINTIAN
MA SHENGJUN
QIAO BAOMIN
description The invention relates to the technical field of discretization of continuous attributes of big data, in particular to an unsupervised discretization method for continuous attribute data based on information entropy. The method includes the steps as follows: the first step of traversing all value records of any continuous attribute, counting discrete granularity |nj| of the attribute and the probability qji of each different value, and recording a maximum njmax and minimum njmin; the second step of obtaining a calculating formula of the value chaos degree of any continuous attribute nj according to a calculating formula of the information entropy, and calculating the value chaos degree of the attribute according to the formula; the third step of rounding down the value chaos degree to obtain the number of break points; the fourth step of adopting an equivalent width interval method to calculate the width of each divided interval, and determining the position of each break point; and the fifth step of discretizi
format Patent
fulltext fulltext_linktorsrc
language chi ; eng
recordid cdi_epo_espacenet_CN108073553A
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
title Unsupervised discretization method for continuous attribute data based on information entropy
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T21%3A32%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=CHEN%20WANGHU&rft.date=2018-05-25&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN108073553A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true