Image character information extraction method and system and storage medium


Bibliographic details
Main authors: CHEN GUANSHENG, CHEN TAO, FENG XINYAO, WU MENGWEI, WEI ZILI, ZHANG YINCUI
Format: Patent
Language: Chinese; English
Description
Summary: The invention discloses a method, a system, and a storage medium for extracting text information from images. The method converts an image-text data set to text to obtain text data, segments the text data into words, and calculates the similarity between each phrase and each theme feature word in a preset theme feature word bank. Phrases whose similarity is greater than a preset similarity are taken as key feature words, so that text data deviating from the theme is filtered out. Weights are also assigned to the key feature words so as to divide all phrases into hot words and non-hot words, and non-key image-text data is filtered out according to the number of non-hot words. This reduces the volume of non-key image-text data and the space occupied by the image and text extraction process. Meanwhile, candidate box labeling is pe
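
The filtering steps named in the summary can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the record does not say how similarity is measured, how weights are assigned, or what makes a word "hot". Here similarity is a character-level ratio from the standard library, a key feature word's weight is its occurrence count, and the word bank, thresholds, and the names `classify`, `HOT_WEIGHT`, and `MAX_NON_HOT` are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

# All values below are hypothetical placeholders, not taken from the patent.
THEME_WORDS = ["invoice", "payment", "contract"]  # preset theme feature word bank
SIM_THRESHOLD = 0.8   # preset similarity
HOT_WEIGHT = 2        # minimum weight for a key feature word to count as hot
MAX_NON_HOT = 3       # more non-hot words than this marks the item as non-key


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for the unspecified measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(text: str) -> str:
    """Label one converted text as 'off_theme', 'non_key', or 'key'."""
    phrases = text.split()  # whitespace split stands in for word segmentation

    # Key feature words: phrases similar enough to some theme feature word.
    key_words = {
        p for p in phrases
        if max(similarity(p, t) for t in THEME_WORDS) > SIM_THRESHOLD
    }
    if not key_words:
        return "off_theme"

    # Assumed weighting: a key feature word's weight is its occurrence count.
    weights = Counter(p for p in phrases if p in key_words)
    hot_words = {p for p, w in weights.items() if w >= HOT_WEIGHT}

    # Every phrase that is not a hot word counts as non-hot.
    non_hot_count = sum(1 for p in phrases if p not in hot_words)
    return "non_key" if non_hot_count > MAX_NON_HOT else "key"
```

In this sketch an item is discarded either when it contains no key feature words at all or when its count of non-hot words exceeds the limit. A real system would presumably use a proper word segmenter and a semantic similarity measure in place of the stand-ins used here.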