Text abstract automatic generation method based on XLNet

The invention discloses a text abstract automatic generation method based on XLNet, and mainly solves the problems of low sentence fluency and accuracy in the text abstract automatic generation process. The method comprises the following steps: obtaining paired text and abstract data, and constructi...

Detailed description

Bibliographic details
Main authors: ZHANG HUAIYU, LIU HONGYING, SHANG FANHUA, SHEN XIONGJIE, WANG ZHONGSHU, CHEN SUNHU
Format: Patent
Language: chi ; eng
creator ZHANG HUAIYU
LIU HONGYING
SHANG FANHUA
SHEN XIONGJIE
WANG ZHONGSHU
CHEN SUNHU
description The invention discloses an XLNet-based method for automatically generating text abstracts, which mainly addresses the low sentence fluency and accuracy of automatically generated abstracts. The method comprises the following steps: obtaining paired text and abstract data and constructing a training set; constructing a dictionary containing all common words and characters; building a backbone network with the pre-trained XLNet as the encoder and Transformer-XL as the decoder; segmenting the text data in the training set into words and encoding it into vectors to obtain the network input, and fine-tuning the network; and segmenting and encoding the test text and feeding it to the trained network N to obtain the abstract. The abstracts generated by the method have good accuracy and language fluency and have a certain practical value.
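The dictionary and encoding steps named in the description can be illustrated with a minimal Python sketch. It is not the patent's implementation: the whitespace segmentation, the `<pad>` and `<unk>` tokens, the function names `build_vocab` and `encode`, and the toy text/abstract pairs are all assumptions made for illustration. A real system would use a proper word segmenter and the tokenizer that ships with the pre-trained XLNet.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens, not taken from the patent


def build_vocab(texts, min_count=1):
    """Map every sufficiently frequent word and character to an integer ID."""
    counts = Counter()
    for text in texts:
        words = text.split()                           # stand-in for real word segmentation
        counts.update(words)                           # whole words
        counts.update(ch for w in words for ch in w)   # single characters as well
    vocab = {PAD: 0, UNK: 1}
    # Most frequent tokens get the smallest IDs; ties are broken alphabetically.
    for token, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count and token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Segment text on whitespace, map tokens to IDs, then truncate or pad to max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in text.split()][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Toy (text, abstract) pairs standing in for the training set.
pairs = [("the cat sat on the mat", "cat sat"), ("the dog ran", "dog ran")]
vocab = build_vocab(text for pair in pairs for text in pair)
encoded = [(encode(src, vocab, 8), encode(tgt, vocab, 4)) for src, tgt in pairs]
```

Each entry of `encoded` holds a fixed-length ID sequence for the source text and one for its abstract, which is the kind of input an encoder-decoder network would be fine-tuned on. Words missing from the dictionary map to the `<unk>` ID.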
format Patent
date 2020-04-24
oa free_for_read
linktohtml https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20200424&DB=EPODOC&CC=CN&NR=111061861A
language chi ; eng
recordid cdi_epo_espacenet_CN111061861A
source esp@cenet
subjects CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
HANDLING RECORD CARRIERS
PHYSICS
PRESENTATION OF DATA
RECOGNITION OF DATA
RECORD CARRIERS
title Text abstract automatic generation method based on XLNet
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T02%3A51%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=ZHANG%20HUAIYU&rft.date=2020-04-24&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN111061861A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true