Method for collecting data of word segmentation dictionary based on statistical machine learning method
Main authors: | , , , , , , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | The invention relates to basic data processing, specifically to a method for collecting data for a word segmentation dictionary based on statistical machine learning. The method uses a classification approach to acquire domain concepts, treating domain concept acquisition as a binary classification problem. The acquired concepts are then used to process the collected information or data, build an information database and an index database, and form the data content the user wants. The system responds to the user's retrieval requests and provides the required information or pointers to it, which increases the accuracy of information retrieval. |
---|
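
The summary names a pipeline (binary classification of candidate terms, then an index over the accepted terms) but not a specific model. The following is a minimal sketch of that idea under stated assumptions: a multinomial naive Bayes classifier stands in for the unspecified statistical learner, the context words of a candidate term are its features, and whitespace tokenization is used only because the toy data is English. All names (`ConceptClassifier`, `build_index`, the sample data) are hypothetical and do not come from the patent.

```python
import math
from collections import Counter, defaultdict


class ConceptClassifier:
    """Multinomial naive Bayes: is a candidate term a domain concept (True) or not (False)?"""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        self.vocab = set()

    def fit(self, samples):
        # samples: iterable of (context_words, is_concept) pairs
        for words, label in samples:
            self.class_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)

    def _log_score(self, words, label):
        total = sum(self.word_counts[label].values())
        prior = self.class_counts[label] / sum(self.class_counts.values())
        score = math.log(prior)
        for w in words:
            # Laplace smoothing keeps unseen words from zeroing the score
            count = self.word_counts[label][w]
            score += math.log((count + 1) / (total + len(self.vocab)))
        return score

    def is_concept(self, words):
        return self._log_score(words, True) > self._log_score(words, False)


def build_index(documents, dictionary):
    """Inverted index: dictionary term -> sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            if token in dictionary:
                index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Toy training data (assumed): context words labeled concept / non-concept.
training = [
    (["segmentation", "dictionary", "corpus"], True),
    (["dictionary", "segmentation", "entry"], True),
    (["rain", "sunny", "today"], False),
    (["today", "food", "rain"], False),
]
clf = ConceptClassifier()
clf.fit(training)

# Candidate terms with their observed context words.
candidates = {
    "tokenizer": ["segmentation", "corpus"],
    "picnic": ["rain", "food"],
}
dictionary = {term for term, ctx in candidates.items() if clf.is_concept(ctx)}

documents = {
    1: "a tokenizer splits text into words",
    2: "the picnic was cancelled",
    3: "each tokenizer needs a dictionary",
}
index = build_index(documents, dictionary)
print(dictionary, index)
```

Only candidates the classifier accepts enter the dictionary, and the index then maps each accepted term to the documents that contain it, which corresponds to the "pointers" the summary mentions. A real system for Chinese text would need a proper segmenter and a much larger labeled sample.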