Making good choices of non-redundant n-gram words

This paper proposes a complete method for automatically selecting good, non-redundant n-gram words as attributes for textual data. In general, n-gram words with n ≥ 2 are needed to improve the subjective interpretability of a text mining task. In these cases, the n...

Bibliographic details
Main authors: Moura, Maria Fernanda; Nogueira, Bruno Magalhaes; da Silva Conrado, Merley; dos Santos, Fabiano Fernandes; Rezende, Solange Oliveira
Format: Conference proceeding
Language: eng
container_end_page 71
container_issue
container_start_page 64
container_title
container_volume
creator Moura, Maria Fernanda
Nogueira, Bruno Magalhaes
da Silva Conrado, Merley
dos Santos, Fabiano Fernandes
Rezende, Solange Oliveira
description This paper proposes a complete method for automatically selecting good, non-redundant n-gram words as attributes for textual data. In general, n-gram words with n ≥ 2 are needed to improve the subjective interpretability of a text mining task. In these cases, the n-gram words are statistically generated and selected, which always introduces redundancy. The proposed method eliminates only the redundancies. This is shown by the results of classifiers on the original and the non-redundant data sets: categorization effectiveness does not decrease. Additionally, the method is useful for any kind of machine learning process applied to a text mining task.
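
The abstract does not spell out how redundancy is detected, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes whitespace tokenization, a plain frequency threshold (`min_count`) in place of the paper's statistical selection, and a hypothetical redundancy rule: a single word is dropped when too few of its occurrences fall outside the selected bigrams. The names `ngrams` and `select_attributes` are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def select_attributes(docs, min_count=2):
    """Pick bigrams and the unigrams that are not redundant with them.

    Assumption (not from the paper): a bigram is selected when it occurs
    at least `min_count` times, and a unigram is kept only if it occurs
    at least `min_count` times outside every selected bigram.
    """
    tokenized = [doc.lower().split() for doc in docs]

    # Step 1: generate bigrams and select the frequent ones.
    bigram_counts = Counter(bg for toks in tokenized for bg in ngrams(toks, 2))
    kept_bigrams = {bg for bg, count in bigram_counts.items() if count >= min_count}

    # Step 2: count unigram occurrences not covered by a selected bigram.
    free_counts = Counter()
    for toks in tokenized:
        covered = [False] * len(toks)
        for i in range(len(toks) - 1):
            if (toks[i], toks[i + 1]) in kept_bigrams:
                covered[i] = covered[i + 1] = True
        free_counts.update(t for t, is_cov in zip(toks, covered) if not is_cov)

    # Step 3: unigrams seen mostly inside selected bigrams are treated as redundant.
    kept_unigrams = {t for t, count in free_counts.items() if count >= min_count}
    return sorted(kept_unigrams), sorted(" ".join(bg) for bg in kept_bigrams)


docs = [
    "text mining is useful",
    "text mining needs data",
    "data is useful",
]
print(select_attributes(docs))
# (['data'], ['is useful', 'text mining'])
```

In this toy corpus "text" and "mining" only ever appear together, so the bigram "text mining" carries their information and the two single words are dropped, while "data" survives because it occurs on its own.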
doi_str_mv 10.1109/ICCITECHN.2008.4803111
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4803111</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4803111</ieee_id><sourcerecordid>4803111</sourcerecordid><originalsourceid>FETCH-ieee_primary_48031113</originalsourceid><addsrcrecordid>eNp9jsEKgkAURSdCKMsvCGJ-QHvPGVPXYuSiVu5l0NGsnImZIvr7CoR23c3lcM_iErJGCBAh3RRZVpR5tj8GIUAS8AQYIk6IizzkPES2jaY_iGKHuF8xBYiBzYhn7Rk-4RFLMJoTOIhLrzraad3Q-qT7WlqqW6q08o1sHqoR6k6V3xkxPLVp7JI4rbha6Y29IKtdXmZ7v5dSVjfTD8K8qvEW-7--ASZ0OAk</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Making good choices of non-redundant n-gramwords</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Moura, Maria Fernanda ; Nogueira, Bruno Magalhaes ; da Silva Conrado, Merley ; dos Santos, Fabiano Fernandes ; Rezende, Solange Oliveira</creator><creatorcontrib>Moura, Maria Fernanda ; Nogueira, Bruno Magalhaes ; da Silva Conrado, Merley ; dos Santos, Fabiano Fernandes ; Rezende, Solange Oliveira</creatorcontrib><description>A new complete proposal to solve the problem of automatically selecting good and non redundant n-gram words as attributes for textual data is proposed. Generally, the use of n-gram words is required to improve the subjective interpretability of a text mining task, with n ges 2. In these cases, the n-gram words are statistically generated and selected, which always implies in redundancy. The proposed method eliminates only the redundancies. This can be observed by the results of classifiers over the original and the non redundant data sets, because, there is not a decrease in the categorization effectiveness. 
Additionally, the method is useful for any kind of machine learning process applied to a text mining task.</description><identifier>ISBN: 1424421357</identifier><identifier>ISBN: 9781424421350</identifier><identifier>EISBN: 1424421365</identifier><identifier>EISBN: 9781424421367</identifier><identifier>DOI: 10.1109/ICCITECHN.2008.4803111</identifier><identifier>LCCN: 2008900703</identifier><language>eng</language><publisher>IEEE</publisher><subject>Artificial intelligence ; Data mining ; Decision making ; Frequency estimation ; Machine learning ; Manuals ; Mathematics ; Proposals ; Supervised learning ; Text mining</subject><ispartof>2008 11th International Conference on Computer and Information Technology, 2008, p.64-71</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4803111$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>310,311,781,785,790,791,2059,27930,54925</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4803111$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Moura, Maria Fernanda</creatorcontrib><creatorcontrib>Nogueira, Bruno Magalhaes</creatorcontrib><creatorcontrib>da Silva Conrado, Merley</creatorcontrib><creatorcontrib>dos Santos, Fabiano Fernandes</creatorcontrib><creatorcontrib>Rezende, Solange Oliveira</creatorcontrib><title>Making good choices of non-redundant n-gramwords</title><title>2008 11th International Conference on Computer and Information Technology</title><addtitle>ICCITECHN</addtitle><description>A new complete proposal to solve the problem of automatically selecting good and non redundant n-gram words as attributes for textual data is proposed. 
Generally, the use of n-gram words is required to improve the subjective interpretability of a text mining task, with n ges 2. In these cases, the n-gram words are statistically generated and selected, which always implies in redundancy. The proposed method eliminates only the redundancies. This can be observed by the results of classifiers over the original and the non redundant data sets, because, there is not a decrease in the categorization effectiveness. Additionally, the method is useful for any kind of machine learning process applied to a text mining task.</description><subject>Artificial intelligence</subject><subject>Data mining</subject><subject>Decision making</subject><subject>Frequency estimation</subject><subject>Machine learning</subject><subject>Manuals</subject><subject>Mathematics</subject><subject>Proposals</subject><subject>Supervised learning</subject><subject>Text mining</subject><isbn>1424421357</isbn><isbn>9781424421350</isbn><isbn>1424421365</isbn><isbn>9781424421367</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2008</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNp9jsEKgkAURSdCKMsvCGJ-QHvPGVPXYuSiVu5l0NGsnImZIvr7CoR23c3lcM_iErJGCBAh3RRZVpR5tj8GIUAS8AQYIk6IizzkPES2jaY_iGKHuF8xBYiBzYhn7Rk-4RFLMJoTOIhLrzraad3Q-qT7WlqqW6q08o1sHqoR6k6V3xkxPLVp7JI4rbha6Y29IKtdXmZ7v5dSVjfTD8K8qvEW-7--ASZ0OAk</recordid><startdate>200812</startdate><enddate>200812</enddate><creator>Moura, Maria Fernanda</creator><creator>Nogueira, Bruno Magalhaes</creator><creator>da Silva Conrado, Merley</creator><creator>dos Santos, Fabiano Fernandes</creator><creator>Rezende, Solange Oliveira</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200812</creationdate><title>Making good choices of non-redundant n-gramwords</title><author>Moura, Maria Fernanda ; Nogueira, 
Bruno Magalhaes ; da Silva Conrado, Merley ; dos Santos, Fabiano Fernandes ; Rezende, Solange Oliveira</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-ieee_primary_48031113</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Artificial intelligence</topic><topic>Data mining</topic><topic>Decision making</topic><topic>Frequency estimation</topic><topic>Machine learning</topic><topic>Manuals</topic><topic>Mathematics</topic><topic>Proposals</topic><topic>Supervised learning</topic><topic>Text mining</topic><toplevel>online_resources</toplevel><creatorcontrib>Moura, Maria Fernanda</creatorcontrib><creatorcontrib>Nogueira, Bruno Magalhaes</creatorcontrib><creatorcontrib>da Silva Conrado, Merley</creatorcontrib><creatorcontrib>dos Santos, Fabiano Fernandes</creatorcontrib><creatorcontrib>Rezende, Solange Oliveira</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Moura, Maria Fernanda</au><au>Nogueira, Bruno Magalhaes</au><au>da Silva Conrado, Merley</au><au>dos Santos, Fabiano Fernandes</au><au>Rezende, Solange Oliveira</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Making good choices of non-redundant n-gramwords</atitle><btitle>2008 11th International Conference on Computer and Information 
Technology</btitle><stitle>ICCITECHN</stitle><date>2008-12</date><risdate>2008</risdate><spage>64</spage><epage>71</epage><pages>64-71</pages><isbn>1424421357</isbn><isbn>9781424421350</isbn><eisbn>1424421365</eisbn><eisbn>9781424421367</eisbn><abstract>A new complete proposal to solve the problem of automatically selecting good and non redundant n-gram words as attributes for textual data is proposed. Generally, the use of n-gram words is required to improve the subjective interpretability of a text mining task, with n ges 2. In these cases, the n-gram words are statistically generated and selected, which always implies in redundancy. The proposed method eliminates only the redundancies. This can be observed by the results of classifiers over the original and the non redundant data sets, because, there is not a decrease in the categorization effectiveness. Additionally, the method is useful for any kind of machine learning process applied to a text mining task.</abstract><pub>IEEE</pub><doi>10.1109/ICCITECHN.2008.4803111</doi></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 1424421357
ispartof 2008 11th International Conference on Computer and Information Technology, 2008, p.64-71
issn
language eng
recordid cdi_ieee_primary_4803111
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Artificial intelligence
Data mining
Decision making
Frequency estimation
Machine learning
Manuals
Mathematics
Proposals
Supervised learning
Text mining
title Making good choices of non-redundant n-gram words
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-15T16%3A59%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Making%20good%20choices%20of%20non-redundant%20n-gramwords&rft.btitle=2008%2011th%20International%20Conference%20on%20Computer%20and%20Information%20Technology&rft.au=Moura,%20Maria%20Fernanda&rft.date=2008-12&rft.spage=64&rft.epage=71&rft.pages=64-71&rft.isbn=1424421357&rft.isbn_list=9781424421350&rft_id=info:doi/10.1109/ICCITECHN.2008.4803111&rft_dat=%3Cieee_6IE%3E4803111%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424421365&rft.eisbn_list=9781424421367&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4803111&rfr_iscdi=true