Unsupervised discretization method for continuous attribute data based on information entropy
The invention relates to the technical field of discretization of continuous attributes of big data, in particular to an unsupervised discretization method for continuous attribute data based on information entropy. The method includes the steps as follows: the first step of traversing all value rec...
Main authors: | CHEN WANGHU, GUO HONGLE, LI XINTIAN, MA SHENGJUN, QIAO BAOMIN |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | CHEN WANGHU GUO HONGLE LI XINTIAN MA SHENGJUN QIAO BAOMIN |
description | The invention relates to the technical field of discretization of continuous attributes of big data, in particular to an unsupervised discretization method for continuous attribute data based on information entropy. The method includes the steps as follows: the first step of traversing all value records of any continuous attribute, counting discrete granularity |nj| of the attribute and the probability qji of each different value, and recording a maximum njmax and minimum njmin; the second step of obtaining a calculating formula of the value chaos degree of any continuous attribute nj according to a calculating formula of the information entropy, and calculating the value chaos degree of the attribute according to the formula; the third step of rounding down the value chaos degree to obtain the number of break points; the fourth step of adopting an equivalent width interval method to calculate the width of each divided interval, and determining the position of each break point; and the fifth step of discretizi |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN108073553A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN108073553A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN108073553A3</originalsourceid><addsrcrecordid>eNqNyzEKwkAQQNE0FqLeYTyAEFmCthIUKystJUyyExwwM8vurKCnN6IHsPrN-9PiepGUA8UHJ_LgOXWRjF9orAID2U099BqhUzGWrDkBmkVusxF4NIQWP-eoWUY4fE8Sixqe82LS4z3R4tdZsTzsz_VxRUEbSgE7ErKmPq3LbblxVeV27h_zBsQHPZ0</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Unsupervised discretization method for continuous attribute data based on information entropy</title><source>esp@cenet</source><creator>CHEN WANGHU ; GUO HONGLE ; LI XINTIAN ; MA SHENGJUN ; QIAO BAOMIN</creator><creatorcontrib>CHEN WANGHU ; GUO HONGLE ; LI XINTIAN ; MA SHENGJUN ; QIAO BAOMIN</creatorcontrib><description>The invention relates to the technical field of discretization of continuous attributes of big data, in particular to an unsupervised discretization method for continuous attribute data based on information entropy. 
The method includes the steps as follows: the first step of traversing all value records of any continuous attribute, counting discrete granularity |nj| of the attribute and the probability qji of each different value, and recording a maximum njmax and minimum njmin; the second step of obtaining a calculating formula of the value chaos degree of any continuous attribute nj according to a calculating formula of the information entropy, and calculating the value chaos degree of the attribute according to the formula; the third step of rounding down the value chaos degree to obtain the number of break points; the fourth step of adopting an equivalent width interval method to calculate the width of each divided interval, and determining the position of each break point; and the fifth step of discretizi</description><language>chi ; eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2018</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20180525&DB=EPODOC&CC=CN&NR=108073553A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25563,76318</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20180525&DB=EPODOC&CC=CN&NR=108073553A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>CHEN WANGHU</creatorcontrib><creatorcontrib>GUO HONGLE</creatorcontrib><creatorcontrib>LI XINTIAN</creatorcontrib><creatorcontrib>MA SHENGJUN</creatorcontrib><creatorcontrib>QIAO BAOMIN</creatorcontrib><title>Unsupervised discretization method for continuous attribute data based 
on information entropy</title><description>The invention relates to the technical field of discretization of continuous attributes of big data, in particular to an unsupervised discretization method for continuous attribute data based on information entropy. The method includes the steps as follows: the first step of traversing all value records of any continuous attribute, counting discrete granularity |nj| of the attribute and the probability qji of each different value, and recording a maximum njmax and minimum njmin; the second step of obtaining a calculating formula of the value chaos degree of any continuous attribute nj according to a calculating formula of the information entropy, and calculating the value chaos degree of the attribute according to the formula; the third step of rounding down the value chaos degree to obtain the number of break points; the fourth step of adopting an equivalent width interval method to calculate the width of each divided interval, and determining the position of each break point; and the fifth step of discretizi</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2018</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNyzEKwkAQQNE0FqLeYTyAEFmCthIUKystJUyyExwwM8vurKCnN6IHsPrN-9PiepGUA8UHJ_LgOXWRjF9orAID2U099BqhUzGWrDkBmkVusxF4NIQWP-eoWUY4fE8Sixqe82LS4z3R4tdZsTzsz_VxRUEbSgE7ErKmPq3LbblxVeV27h_zBsQHPZ0</recordid><startdate>20180525</startdate><enddate>20180525</enddate><creator>CHEN WANGHU</creator><creator>GUO HONGLE</creator><creator>LI XINTIAN</creator><creator>MA SHENGJUN</creator><creator>QIAO BAOMIN</creator><scope>EVB</scope></search><sort><creationdate>20180525</creationdate><title>Unsupervised discretization method for continuous attribute data based on information entropy</title><author>CHEN WANGHU ; 
GUO HONGLE ; LI XINTIAN ; MA SHENGJUN ; QIAO BAOMIN</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN108073553A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; eng</language><creationdate>2018</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>CHEN WANGHU</creatorcontrib><creatorcontrib>GUO HONGLE</creatorcontrib><creatorcontrib>LI XINTIAN</creatorcontrib><creatorcontrib>MA SHENGJUN</creatorcontrib><creatorcontrib>QIAO BAOMIN</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>CHEN WANGHU</au><au>GUO HONGLE</au><au>LI XINTIAN</au><au>MA SHENGJUN</au><au>QIAO BAOMIN</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Unsupervised discretization method for continuous attribute data based on information entropy</title><date>2018-05-25</date><risdate>2018</risdate><abstract>The invention relates to the technical field of discretization of continuous attributes of big data, in particular to an unsupervised discretization method for continuous attribute data based on information entropy. 
The method includes the steps as follows: the first step of traversing all value records of any continuous attribute, counting discrete granularity |nj| of the attribute and the probability qji of each different value, and recording a maximum njmax and minimum njmin; the second step of obtaining a calculating formula of the value chaos degree of any continuous attribute nj according to a calculating formula of the information entropy, and calculating the value chaos degree of the attribute according to the formula; the third step of rounding down the value chaos degree to obtain the number of break points; the fourth step of adopting an equivalent width interval method to calculate the width of each divided interval, and determining the position of each break point; and the fifth step of discretizi</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN108073553A |
source | esp@cenet |
subjects | CALCULATING COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS |
title | Unsupervised discretization method for continuous attribute data based on information entropy |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T21%3A32%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=CHEN%20WANGHU&rft.date=2018-05-25&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN108073553A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
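
The five steps listed in the description field can be sketched in Python as follows. This is only an illustrative reading of the abstract, not the patented implementation. The function name `entropy_discretize` is hypothetical. The record does not give the "value chaos degree" formula, so the sketch assumes plain Shannon entropy in bits; it also assumes that k break points split the range [njmin, njmax] into k + 1 equal-width intervals, and that the fifth step (truncated in the record) maps each value to the index of its interval.

```python
import math
from bisect import bisect_right
from collections import Counter


def entropy_discretize(values):
    """Return (break_points, labels) for one continuous attribute.

    Hypothetical helper; see the assumptions stated in the lead-in.
    """
    # Step 1: count the distinct values, their probabilities, and the range.
    counts = Counter(values)
    total = len(values)
    probs = [c / total for c in counts.values()]
    lo, hi = min(values), max(values)

    # Step 2: "value chaos degree", ASSUMED to be Shannon entropy in bits.
    chaos = -sum(q * math.log2(q) for q in probs)

    # Step 3: round down to obtain the number of break points.
    n_breaks = math.floor(chaos)

    # Step 4: equal-width intervals.
    # ASSUMPTION: n_breaks break points give n_breaks + 1 intervals.
    width = (hi - lo) / (n_breaks + 1)
    break_points = [lo + width * i for i in range(1, n_breaks + 1)]

    # Step 5 (truncated in the record): ASSUMED to replace each value
    # with the index of the interval it falls into.
    labels = [bisect_right(break_points, v) for v in values]
    return break_points, labels
```

For the input `[1, 2, 3, 4, 5, 6, 7, 8]`, each value has probability 1/8, so the entropy is 3 bits; the sketch then places three break points at 2.75, 4.5 and 6.25 and returns the labels `[0, 0, 1, 1, 2, 2, 3, 3]`. A value that coincides with a break point is assigned to the upper interval, which is a choice of this sketch rather than something the record specifies.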