File classification labeling method and system based on sequence coding

Bibliographic details
Main authors: WANG ZHELONG, ZOU XILIN, LIU YIJUAN, GAO YUHUA, LI JING, LI ZHAORU, LI LI, WANG QIAN, WANG RUOHAN, WU XUEXIA, XU MEILING, CHEN YUNLONG, HOU YANWEN, ZHANG XUEMEI, LIU JIYAN, JU WENJIE, YU XIANGJIE, SUI XIN, WANG WEISHUAI, REN CHANGYU
Format: Patent
Language: chi; eng
Description
Summary: The invention belongs to the technical field of text classification and provides a file classification labeling method and system based on sequence coding. The method comprises the following steps: obtaining the position features of a file to be classified; according to the obtained position features, performing word embedding on the word-level syntactic and semantic information in the file and sequence coding on the sentence-level relation and structure information, which completes the conversion of the file from document space to vector space; extracting the vector space features of the file and performing sequence coding on the extracted features; and classifying the file based on the sequence codes of the vector space features and a preset file classification model.
本公开属于文本分类技术领域,提供了一种基于序列编码的文件分类标注方法与系统,包括以下步骤:获取待分类文件的位置特征;根据所获取的位置特征,对待分类文件中单词级的句法和语义信息进行词嵌入,对待分类文件中的句子级的关系和结构信息进行序列编码,完成待分类文件
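The steps named in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not say how position features, embeddings, sequence codes, or the preset classification model are built, so every choice below is an assumption made for illustration. Position features are sinusoidal, the word embedding is a hash-based stand-in for a learned one, sequence coding is a simple recurrent fold over sentence vectors, and the "preset model" is a nearest-centroid lookup. All function names and the DIM constant are hypothetical.

```python
import hashlib
import math

DIM = 8  # hypothetical vector width; must not exceed 32 (SHA-256 digest length)


def position_features(index, dim=DIM):
    """Sinusoidal position encoding (assumed; the record names no scheme)."""
    return [
        math.sin(index / (10000 ** (i / dim))) if i % 2 == 0
        else math.cos(index / (10000 ** ((i - 1) / dim)))
        for i in range(dim)
    ]


def embed_word(word, dim=DIM):
    """Deterministic stand-in for a learned embedding: hash bytes scaled to [-1, 1]."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return [digest[i] / 127.5 - 1.0 for i in range(dim)]


def encode_sentence(sentence, dim=DIM):
    """Word level: average of (embedding + position features) over the words."""
    words = sentence.split()
    if not words:
        return [0.0] * dim
    total = [0.0] * dim
    for idx, word in enumerate(words):
        emb = embed_word(word, dim)
        pos = position_features(idx, dim)
        for i in range(dim):
            total[i] += emb[i] + pos[i]
    return [t / len(words) for t in total]


def encode_document(text, dim=DIM, decay=0.5):
    """Sentence level: fold sentence vectors in order, state = decay*state + (1-decay)*v."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    state = [0.0] * dim
    for sentence in sentences:
        vec = encode_sentence(sentence, dim)
        state = [decay * h + (1 - decay) * x for h, x in zip(state, vec)]
    return state


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, centroids):
    """Nearest-centroid stand-in for the preset file classification model."""
    vec = encode_document(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))
```

In use, `centroids` would map each label to a vector produced by `encode_document` from example files of that class, and `classify(new_text, centroids)` would return the label whose centroid is closest in cosine similarity. A real system would replace the hash embedding and the fold with trained components.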