Annotation method of risk data in a certain field based on pattern matching


Bibliographic details
Published in: E3S Web of Conferences, 2024-01, Vol. 522, p. 1046
Main authors: Geng, Weibo; Zhao, Yingxiao; Xu, Ping; Cai, Jiaoyang; Fang, Fang
Format: Article
Language: English
Online access: Full text
Description
Abstract: With the development of information technology and the growing complexity of industrial technology, a certain field urgently needs big data and artificial intelligence to improve its management and decision-making. To classify the field’s risk text data with intelligent algorithms, and to analyse the risk distribution and the major problems, this paper studies methods for annotating training data in this field. The proposed annotation method is based on pattern matching and addresses the particular difficulties of risk data annotation in this field: highly specialized content, small data volume, and strict accuracy and timeliness requirements. New matching patterns are generated through text segmentation, keyword extraction, preliminary pattern generation, pattern relation tree construction, pattern optimization, pattern generalization, and pattern verification; the final classification and annotation are performed after pattern matching. Performance tests of accuracy, recall, and annotation time show that the overall performance of the proposed method exceeds that of traditional item-by-item manual annotation and of semi-automatic annotation based on machine learning. The method has strong application value for risk data annotation in this field, and may also serve as a reference for high-density, high-accuracy, and high-timeliness data annotation in other fields.
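The abstract names the pipeline steps but gives no implementation details. The following Python sketch is only a rough illustration of the general idea of pattern-based annotation (segment, extract keywords, form patterns, match, label). It is not the authors' method: it omits the pattern relation tree, optimization, generalization, and verification steps, and every name in it (`segment`, `extract_keywords`, `build_patterns`, `annotate`, the stopword list, the sample texts and labels) is a hypothetical assumption.

```python
import re
from collections import Counter

# Assumed, illustrative stopword list; a real system would use a domain lexicon.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "was", "is", "at", "during"}


def segment(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keywords(text, top_k=3):
    """Return up to top_k of the most frequent non-stopword tokens."""
    counts = Counter(t for t in segment(text) if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def build_patterns(labelled_samples, top_k=3):
    """Turn each (text, label) seed sample into a (keyword set, label) pattern."""
    return [
        (frozenset(extract_keywords(text, top_k)), label)
        for text, label in labelled_samples
    ]


def annotate(text, patterns):
    """Label the text with the largest pattern whose keywords all occur in it.

    Returns None when no pattern matches, so the item can go to manual review.
    """
    tokens = set(segment(text))
    matches = [(len(kw), label) for kw, label in patterns if kw and kw <= tokens]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0])[1]


# Hypothetical seed data: two manually labelled risk records.
seed = [
    ("hydraulic pump leak", "equipment"),
    ("operator skipped checklist", "human"),
]
patterns = build_patterns(seed)

label_a = annotate("A leak was found at the hydraulic pump during inspection", patterns)
label_b = annotate("The operator skipped the pre-start checklist", patterns)
label_c = annotate("Weather delayed the delivery", patterns)
```

In this sketch a pattern is just a set of keywords that must all appear in a record. Returning `None` for unmatched records reflects one possible design choice, namely leaving uncertain items for manual annotation rather than guessing a label.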
ISSN:2267-1242
DOI:10.1051/e3sconf/202452201046