Text risk identification method and device
The invention relates to a text risk identification method and device in the technical field of computers. The method first builds a vectorized representation of the semantics of a to-be-queried text to obtain a first initial feature vector, then whitening...
Main authors: | WANG YUBIN, DONG QIJIANG, QI SHUMEI |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | WANG YUBIN ; DONG QIJIANG ; QI SHUMEI |
description | The invention relates to a text risk identification method and device in the technical field of computers. The method first builds a vectorized representation of the semantics of a to-be-queried text to obtain a first initial feature vector, then applies whitening to obtain a target feature vector. The target feature vector is matched by similarity against the sample feature vectors, and the risk identification result for the to-be-queried text is determined from the matching result. The sample feature vectors are obtained by building the same vectorized semantic representation of risk texts and whitening it. Whitening reduces the influence of word frequency on the semantic representation, and because similarity comparison is an unsupervised form of risk identification, the sample labeling cost is eliminated and the accuracy of similarity matching is improved; the accu |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN116662540A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN116662540A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN116662540A3</originalsourceid><addsrcrecordid>eNrjZNAKSa0oUSjKLM5WyExJzSvJTMtMTizJzM9TyE0tychPUUjMS1FISS3LTE7lYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmZmZkamJgaOxsSoAQAvLij7</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Text risk identification method and device</title><source>esp@cenet</source><creator>WANG YUBIN ; DONG QIJIANG ; QI SHUMEI</creator><creatorcontrib>WANG YUBIN ; DONG QIJIANG ; QI SHUMEI</creatorcontrib><description>The invention relates to a text risk identification method and device, and relates to the technical field of computers. According to the method, vectorization representation of text semantics can be firstly carried out on a to-be-queried text to obtain a first initial feature vector, then whitening processing is carried out to obtain a target feature vector, similarity matching is carried out in a sample feature vector based on the target feature vector, and a risk identification result of the to-be-queried text is determined according to a matching result; wherein the sample feature vector is obtained by performing vectorization representation of text semantics on the risk text and performing whitening processing. 
According to the method, whitening processing is adopted to reduce the influence of word frequency on semantic representation, and similarity comparison is unsupervised risk identification, so that the sample labeling cost is eliminated, and the accuracy of similarity matching is improved; the accu</description><language>chi ; eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20230829&DB=EPODOC&CC=CN&NR=116662540A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76547</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20230829&DB=EPODOC&CC=CN&NR=116662540A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>WANG YUBIN</creatorcontrib><creatorcontrib>DONG QIJIANG</creatorcontrib><creatorcontrib>QI SHUMEI</creatorcontrib><title>Text risk identification method and device</title><description>The invention relates to a text risk identification method and device, and relates to the technical field of computers. 
According to the method, vectorization representation of text semantics can be firstly carried out on a to-be-queried text to obtain a first initial feature vector, then whitening processing is carried out to obtain a target feature vector, similarity matching is carried out in a sample feature vector based on the target feature vector, and a risk identification result of the to-be-queried text is determined according to a matching result; wherein the sample feature vector is obtained by performing vectorization representation of text semantics on the risk text and performing whitening processing. According to the method, whitening processing is adopted to reduce the influence of word frequency on semantic representation, and similarity comparison is unsupervised risk identification, so that the sample labeling cost is eliminated, and the accuracy of similarity matching is improved; the accu</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZNAKSa0oUSjKLM5WyExJzSvJTMtMTizJzM9TyE0tychPUUjMS1FISS3LTE7lYWBNS8wpTuWF0twMim6uIc4euqkF-fGpxQWJyal5qSXxzn6GhmZmZkamJgaOxsSoAQAvLij7</recordid><startdate>20230829</startdate><enddate>20230829</enddate><creator>WANG YUBIN</creator><creator>DONG QIJIANG</creator><creator>QI SHUMEI</creator><scope>EVB</scope></search><sort><creationdate>20230829</creationdate><title>Text risk identification method and device</title><author>WANG YUBIN ; DONG QIJIANG ; QI SHUMEI</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN116662540A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; 
eng</language><creationdate>2023</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>WANG YUBIN</creatorcontrib><creatorcontrib>DONG QIJIANG</creatorcontrib><creatorcontrib>QI SHUMEI</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>WANG YUBIN</au><au>DONG QIJIANG</au><au>QI SHUMEI</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Text risk identification method and device</title><date>2023-08-29</date><risdate>2023</risdate><abstract>The invention relates to a text risk identification method and device, and relates to the technical field of computers. According to the method, vectorization representation of text semantics can be firstly carried out on a to-be-queried text to obtain a first initial feature vector, then whitening processing is carried out to obtain a target feature vector, similarity matching is carried out in a sample feature vector based on the target feature vector, and a risk identification result of the to-be-queried text is determined according to a matching result; wherein the sample feature vector is obtained by performing vectorization representation of text semantics on the risk text and performing whitening processing. According to the method, whitening processing is adopted to reduce the influence of word frequency on semantic representation, and similarity comparison is unsupervised risk identification, so that the sample labeling cost is eliminated, and the accuracy of similarity matching is improved; the accu</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN116662540A |
source | esp@cenet |
subjects | CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS |
title | Text risk identification method and device |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T03%3A46%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=WANG%20YUBIN&rft.date=2023-08-29&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN116662540A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
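
The pipeline outlined in the description field (embed, whiten, match by similarity) might look roughly like the sketch below. It is not the patented implementation. It assumes the semantic embeddings have already been produced by some text encoder, which the record does not name. The function names, the Cholesky-based whitening, fitting the whitening statistics on the sample vectors, the small ridge term `eps`, the use of cosine similarity, and the `threshold` value are all illustrative assumptions.

```python
import math


def fit_whitening(vectors, eps=1e-6):
    """Estimate whitening parameters from a list of equal-length vectors.

    Returns (mean, L), where L is the lower Cholesky factor of the
    covariance matrix. eps is an assumed ridge term that keeps the
    covariance positive definite when there are few samples.
    """
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    cov = [
        [sum(c[i] * c[j] for c in centered) / n + (eps if i == j else 0.0)
         for j in range(d)]
        for i in range(d)
    ]
    # Cholesky decomposition: cov = L * L^T
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return mean, L


def whiten(vec, mean, L):
    """Whiten one vector by solving L * y = (vec - mean) with forward substitution."""
    y = [0.0] * len(vec)
    for i in range(len(vec)):
        s = vec[i] - mean[i] - sum(L[i][k] * y[k] for k in range(i))
        y[i] = s / L[i][i]
    return y


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def identify_risk(query_vec, sample_vecs, mean, L, threshold=0.8):
    """Flag the query as risky if its whitened vector is close to any whitened sample.

    threshold is a hypothetical cut-off; the record does not state one.
    Returns (is_risky, best_similarity).
    """
    target = whiten(query_vec, mean, L)
    best = max(cosine(target, whiten(s, mean, L)) for s in sample_vecs)
    return best >= threshold, best
```

In use, one would call `fit_whitening` once on the embeddings of the known risk texts, then call `identify_risk` for each incoming query embedding. No labeled non-risk examples are needed, which matches the unsupervised framing in the description.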