Creating taxonomies and training data in multiple languages
The problem of creating of taxonomies of objects, particularly objects that can be represented as text in various languages, and categorizing such objects is addressed by a method for taking the training documents generated in a first language, translating it to a target language, and then generatin...
Gespeichert in:
Hauptverfasser: | , |
---|---|
Format: | Patent |
Sprache: | chi ; eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | CHENG, KEH-SHIN FU GATES, STEPHEN C |
description | The problem of creating of taxonomies of objects, particularly objects that can be represented as text in various languages, and categorizing such objects is addressed by a method for taking the training documents generated in a first language, translating it to a target language, and then generating from a plurality of training documents one or more sets of features representing one or more categories in the target language. The method includes the steps of: forming a first list of items such that each item in the first list represents a particular training document having an association with one or more elements related to a particular category; developing a second list from the first list by deleting one or more candidate documents which satisfy at least one deletion criterion; translating the documents in the second list from the source language to the target language, and extracting the one or more sets of features from the translated second list using one or more feature selection criteria. |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_TW200519645A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>TW200519645A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_TW200519645A3</originalsourceid><addsrcrecordid>eNrjZLB2LkpNLMnMS1coSazIz8vPzUwtVkjMS1EoKUrMzAOJpySWJCpk5inkluaUZBbkpCrkJOallyampxbzMLCmJeYUp_JCaW4GRTfXEGcP3dSC_PjU4oLE5NS81JL4kHAjAwNTQ0szE1NHY2LUAADK8y_N</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Creating taxonomies and training data in multiple languages</title><source>esp@cenet</source><creator>CHENG, KEH-SHIN FU ; GATES, STEPHEN C</creator><creatorcontrib>CHENG, KEH-SHIN FU ; GATES, STEPHEN C</creatorcontrib><description>The problem of creating of taxonomies of objects, particularly objects that can be represented as text in various languages, and categorizing such objects is addressed by a method for taking the training documents generated in a first language, translating it to a target language, and then generating from a plurality of training documents one or more sets of features representing one or more categories in the target language. The method includes the steps of: forming a first list of items such that each item in the first list represents a particular training document having an association with one or more elements related to a particular category; developing a second list from the first list by deleting one or more candidate documents which satisfy at least one deletion criterion; translating the documents in the second list from the source language to the target language, and extracting the one or more sets of features from the translated second list using one or more feature selection criteria.</description><language>chi ; eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2005</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20050616&DB=EPODOC&CC=TW&NR=200519645A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76547</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20050616&DB=EPODOC&CC=TW&NR=200519645A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>CHENG, KEH-SHIN FU</creatorcontrib><creatorcontrib>GATES, STEPHEN C</creatorcontrib><title>Creating taxonomies and training data in multiple languages</title><description>The problem of creating of taxonomies of objects, particularly objects that can be represented as text in various languages, and categorizing such objects is addressed by a method for taking the training documents generated in a first language, translating it to a target language, and then generating from a plurality of training documents one or more sets of features representing one or more categories in the target language. The method includes the steps of: forming a first list of items such that each item in the first list represents a particular training document having an association with one or more elements related to a particular category; developing a second list from the first list by deleting one or more candidate documents which satisfy at least one deletion criterion; translating the documents in the second list from the source language to the target language, and extracting the one or more sets of features from the translated second list using one or more feature selection criteria.</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2005</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZLB2LkpNLMnMS1coSazIz8vPzUwtVkjMS1EoKUrMzAOJpySWJCpk5inkluaUZBbkpCrkJOallyampxbzMLCmJeYUp_JCaW4GRTfXEGcP3dSC_PjU4oLE5NS81JL4kHAjAwNTQ0szE1NHY2LUAADK8y_N</recordid><startdate>20050616</startdate><enddate>20050616</enddate><creator>CHENG, KEH-SHIN FU</creator><creator>GATES, STEPHEN C</creator><scope>EVB</scope></search><sort><creationdate>20050616</creationdate><title>Creating taxonomies and training data in multiple languages</title><author>CHENG, KEH-SHIN FU ; GATES, STEPHEN C</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_TW200519645A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; eng</language><creationdate>2005</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>CHENG, KEH-SHIN FU</creatorcontrib><creatorcontrib>GATES, STEPHEN C</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>CHENG, KEH-SHIN FU</au><au>GATES, STEPHEN C</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Creating taxonomies and training data in multiple languages</title><date>2005-06-16</date><risdate>2005</risdate><abstract>The problem of creating of taxonomies of objects, particularly objects that can be represented as text in various languages, and categorizing such objects is addressed by a method for taking the training documents generated in a first language, translating it to a target language, and then generating from a plurality of training documents one or more sets of features representing one or more categories in the target language. The method includes the steps of: forming a first list of items such that each item in the first list represents a particular training document having an association with one or more elements related to a particular category; developing a second list from the first list by deleting one or more candidate documents which satisfy at least one deletion criterion; translating the documents in the second list from the source language to the target language, and extracting the one or more sets of features from the translated second list using one or more feature selection criteria.</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | chi ; eng |
recordid | cdi_epo_espacenet_TW200519645A |
source | esp@cenet |
subjects | CALCULATING COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS |
title | Creating taxonomies and training data in multiple languages |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T17%3A38%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=CHENG,%20KEH-SHIN%20FU&rft.date=2005-06-16&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ETW200519645A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |