Text risk identification method and device

The invention relates to a text risk identification method and device in the field of computer technology. The method first computes a vector representation of the text semantics of a to-be-queried text to obtain a first initial feature vector, then whitening...

Detailed description

Bibliographic details
Main authors: WANG YUBIN, DONG QIJIANG, QI SHUMEI
Format: Patent
Language: chi ; eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator WANG YUBIN
DONG QIJIANG
QI SHUMEI
description The invention relates to a text risk identification method and device in the field of computer technology. The method first computes a vector representation of the text semantics of a to-be-queried text to obtain a first initial feature vector, then applies whitening to obtain a target feature vector, performs similarity matching of the target feature vector against sample feature vectors, and determines a risk identification result for the to-be-queried text from the matching result. The sample feature vectors are obtained by computing vector representations of the text semantics of risk texts and whitening them. Whitening reduces the influence of word frequency on the semantic representation, and because similarity comparison is an unsupervised form of risk identification, the sample labeling cost is eliminated and the accuracy of similarity matching is improved; the accu
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN116662540A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN116662540A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN116662540A3</originalsourceid><addsrcrecordid>eNrjZNAKSa0oUSjKLM5WyExJzSvJTMtMTizJzM9TyE0tychPUUjMS1FISS3LTE7lYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmZmZkamJgaOxsSoAQAvLij7</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Text risk identification method and device</title><source>esp@cenet</source><creator>WANG YUBIN ; DONG QIJIANG ; QI SHUMEI</creator><creatorcontrib>WANG YUBIN ; DONG QIJIANG ; QI SHUMEI</creatorcontrib><description>The invention relates to a text risk identification method and device, and relates to the technical field of computers. According to the method, vectorization representation of text semantics can be firstly carried out on a to-be-queried text to obtain a first initial feature vector, then whitening processing is carried out to obtain a target feature vector, similarity matching is carried out in a sample feature vector based on the target feature vector, and a risk identification result of the to-be-queried text is determined according to a matching result; wherein the sample feature vector is obtained by performing vectorization representation of text semantics on the risk text and performing whitening processing. 
According to the method, whitening processing is adopted to reduce the influence of word frequency on semantic representation, and similarity comparison is unsupervised risk identification, so that the sample labeling cost is eliminated, and the accuracy of similarity matching is improved; the accu</description><language>chi ; eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20230829&amp;DB=EPODOC&amp;CC=CN&amp;NR=116662540A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76547</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20230829&amp;DB=EPODOC&amp;CC=CN&amp;NR=116662540A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>WANG YUBIN</creatorcontrib><creatorcontrib>DONG QIJIANG</creatorcontrib><creatorcontrib>QI SHUMEI</creatorcontrib><title>Text risk identification method and device</title><description>The invention relates to a text risk identification method and device, and relates to the technical field of computers. 
According to the method, vectorization representation of text semantics can be firstly carried out on a to-be-queried text to obtain a first initial feature vector, then whitening processing is carried out to obtain a target feature vector, similarity matching is carried out in a sample feature vector based on the target feature vector, and a risk identification result of the to-be-queried text is determined according to a matching result; wherein the sample feature vector is obtained by performing vectorization representation of text semantics on the risk text and performing whitening processing. According to the method, whitening processing is adopted to reduce the influence of word frequency on semantic representation, and similarity comparison is unsupervised risk identification, so that the sample labeling cost is eliminated, and the accuracy of similarity matching is improved; the accu</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZNAKSa0oUSjKLM5WyExJzSvJTMtMTizJzM9TyE0tychPUUjMS1FISS3LTE7lYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmZmZkamJgaOxsSoAQAvLij7</recordid><startdate>20230829</startdate><enddate>20230829</enddate><creator>WANG YUBIN</creator><creator>DONG QIJIANG</creator><creator>QI SHUMEI</creator><scope>EVB</scope></search><sort><creationdate>20230829</creationdate><title>Text risk identification method and device</title><author>WANG YUBIN ; DONG QIJIANG ; QI SHUMEI</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN116662540A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; 
eng</language><creationdate>2023</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>WANG YUBIN</creatorcontrib><creatorcontrib>DONG QIJIANG</creatorcontrib><creatorcontrib>QI SHUMEI</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>WANG YUBIN</au><au>DONG QIJIANG</au><au>QI SHUMEI</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Text risk identification method and device</title><date>2023-08-29</date><risdate>2023</risdate><abstract>The invention relates to a text risk identification method and device, and relates to the technical field of computers. According to the method, vectorization representation of text semantics can be firstly carried out on a to-be-queried text to obtain a first initial feature vector, then whitening processing is carried out to obtain a target feature vector, similarity matching is carried out in a sample feature vector based on the target feature vector, and a risk identification result of the to-be-queried text is determined according to a matching result; wherein the sample feature vector is obtained by performing vectorization representation of text semantics on the risk text and performing whitening processing. According to the method, whitening processing is adopted to reduce the influence of word frequency on semantic representation, and similarity comparison is unsupervised risk identification, so that the sample labeling cost is eliminated, and the accuracy of similarity matching is improved; the accu</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language chi ; eng
recordid cdi_epo_espacenet_CN116662540A
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
title Text risk identification method and device
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T03%3A46%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=WANG%20YUBIN&rft.date=2023-08-29&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN116662540A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
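
The pipeline summarized in the description field (whiten the initial feature vector, then match it against whitened risk-sample vectors) could look roughly like the sketch below. It is only an illustration under assumptions: the record does not say which embedding model, whitening formula, similarity measure, or decision rule the patent uses. Here whitening is the common mean-centering plus covariance-SVD transform, similarity is cosine similarity, and the function names `fit_whitening`, `whiten`, `cosine_scores`, `identify_risk` and the `threshold` value are hypothetical. The semantic vectorization step is left out, so the functions take already-computed feature vectors.

```python
import numpy as np


def fit_whitening(sample_vecs, eps=1e-8):
    """Estimate whitening parameters (mu, w) from initial sample vectors.

    Assumed transform: x_white = (x - mu) @ w, with w = U diag(s)^(-1/2)
    taken from the SVD of the sample covariance matrix.
    """
    mu = sample_vecs.mean(axis=0, keepdims=True)
    cov = np.cov((sample_vecs - mu).T)  # shape (dim, dim)
    u, s, _ = np.linalg.svd(cov)        # cov is symmetric, so cov = U diag(s) U^T
    w = u @ np.diag(1.0 / np.sqrt(s + eps))
    return mu, w


def whiten(vecs, mu, w):
    """Apply the whitening transform to one or more row vectors."""
    return (vecs - mu) @ w


def cosine_scores(query, samples):
    """Cosine similarity between one query vector and each sample row."""
    q = query / max(np.linalg.norm(query), 1e-12)
    norms = np.maximum(np.linalg.norm(samples, axis=1, keepdims=True), 1e-12)
    return (samples / norms) @ q


def identify_risk(initial_vec, whitened_samples, mu, w, threshold=0.8):
    """Return (is_risky, best_index, best_score) for one query vector.

    threshold is a hypothetical cut-off; the record does not state how the
    matching result is turned into a risk identification result.
    """
    target = whiten(initial_vec[None, :], mu, w)[0]  # target feature vector
    scores = cosine_scores(target, whitened_samples)
    best = int(np.argmax(scores))
    return bool(scores[best] >= threshold), best, float(scores[best])
```

In this sketch the whitening parameters are fitted once on the risk-text vectors, the whitened samples are stored, and each incoming query is whitened with the same `mu` and `w` before matching, so that query and samples live in the same transformed space.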