Rapid text classification method and device

The invention provides a rapid text classification method and device based on a topic model combined with linear discrimination. A topic model built from bag-of-words and term-frequency vectors, PCA, linear discrimination, and similarity calculation is used to quickly...
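
The bag-of-words, term-frequency, and similarity stages named in the abstract can be illustrated with a minimal sketch. This is not the patented method: the abstract is truncated, so the PCA and linear-discrimination stages are left out, a whitespace split stands in for jieba word segmentation, and every function name, department label, and sample record below is a hypothetical chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace splitting stands in for jieba word segmentation.
    return text.lower().split()


def build_vocabulary(texts):
    # Bag-of-words vocabulary: every distinct token gets a fixed index.
    vocab = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(vocab)}


def tf_vector(text, vocab):
    # Term-frequency vector over the vocabulary; unknown tokens are ignored.
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    total = sum(counts.values())
    vec = [0.0] * len(vocab)
    for tok, n in counts.items():
        vec[vocab[tok]] = n / total
    return vec


def cosine(a, b):
    # Cosine similarity; returns 0.0 when either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def department_profiles(records, vocab):
    # Group historical records by department and average their TF vectors.
    grouped = defaultdict(list)
    for text, dept in records:
        grouped[dept].append(tf_vector(text, vocab))
    return {
        dept: [sum(col) / len(vecs) for col in zip(*vecs)]
        for dept, vecs in grouped.items()
    }


def classify(text, profiles, vocab):
    # Assign the department whose mean profile is most similar to the text.
    vec = tf_vector(text, vocab)
    return max(profiles, key=lambda dept: cosine(vec, profiles[dept]))


# Hypothetical historical appeal data: (text, handling department).
history = [
    ("street light broken on main road", "public_works"),
    ("pothole on road near school", "public_works"),
    ("water bill charged twice", "utilities"),
    ("water supply interrupted since morning", "utilities"),
]
vocab = build_vocabulary([text for text, _ in history])
profiles = department_profiles(history, vocab)
predicted = classify("road has a large pothole", profiles, vocab)
print(predicted)
```

In a fuller pipeline, a dimensionality-reduction step such as PCA would presumably sit between the term-frequency vectors and the similarity comparison, but how the patent combines those stages cannot be recovered from the truncated abstract.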

Bibliographic details
Main authors: XIANG RONGXIN, LIU XIN, WANG LIECHONG, LI DICHENG, HUANG WEI, ZHAO QINGQI
Format: Patent
Language: chi ; eng
Subjects:
Online access: Order full text
creator XIANG RONGXIN
LIU XIN
WANG LIECHONG
LI DICHENG
HUANG WEI
ZHAO QINGQI
description The invention provides a rapid text classification method and device based on a topic model combined with linear discrimination. A topic model built from bag-of-words and term-frequency vectors, PCA, linear discrimination, and similarity calculation is used to quickly and accurately identify the handling department to which new appeal data belongs. The method first preprocesses the obtained historical appeal data, mainly by cleaning missing values and standardizing the data. It then comprises the following steps: grouping the standardized data by the handling department to which each record actually belongs; extracting feature words from each department's grouped data using jieba word segmentation; constructing bag-of-words and term-frequency vectors by a statistical method; training on each department's data and on the overall data using a PCA method based on
format Patent
fullrecord CN113672725A (record ID TN_cdi_epo_espacenet_CN113672725A; source: esp@cenet, sourceid epo_EVB; record type: patent; published 2021-11-19; open access: free_for_read); https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20211119&DB=EPODOC&CC=CN&NR=113672725A
fulltext fulltext_linktorsrc
language chi ; eng
recordid cdi_epo_espacenet_CN113672725A
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
HANDLING RECORD CARRIERS
PHYSICS
PRESENTATION OF DATA
RECOGNITION OF DATA
RECORD CARRIERS
title Rapid text classification method and device
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T07%3A36%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=XIANG%20RONGXIN&rft.date=2021-11-19&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN113672725A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true