ANCIENT WRITING CHARACTER RECOGNITION SYSTEM, ANCIENT WRITING CHARACTER RECOGNITION METHOD AND PROGRAM



Bibliographic Details
Main author:	OKA TOSHIO
Format:	Patent
Language:	English; Japanese
Subjects:	CALCULATING; COMPUTING; COUNTING; HANDLING RECORD CARRIERS; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; PHYSICS; PRESENTATION OF DATA; RECOGNITION OF DATA; RECORD CARRIERS

Description
Summary:	PROBLEM TO BE SOLVED: To provide an ancient writing character (kuzushiji, くずし字) recognition system that converts the ancient writing characters found in documents such as Japanese classical books and old manuscripts (hereafter, "documents") into modern characters with high accuracy, by suppressing recognition errors caused by differences in how characters are written from one document to another. SOLUTION: The ancient writing character recognition system of the present invention includes a machine learning model generation part, which generates a machine learning model (hereafter, simply "model") from samples of documents written in ancient writing characters, and an ancient writing character processing part, which uses the model to recognize character images of ancient writing characters, extracted from a text image of a document, as modern characters. The machine learning model generation part trains the model using any one, a combination, or all of four methods: a first learning method that trains the model on a mixture of the samples of the respective documents; a second learning method that trains the model on the document samples sequentially; a third learning method that trains the model while changing the dimensionality of the feature information of the document samples; and a fourth learning method that trains the model using the Fisher information corresponding to the document samples. SELECTED DRAWING: Figure 1
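
The abstract names the learning methods without giving details, so the following Python sketch is only one possible reading of the first, second, and fourth methods, not the patented implementation. It assumes a linear softmax classifier over toy 2-D features, and it assumes that the Fisher information is used as an elastic-weight-consolidation-style penalty that keeps weights important for an earlier document close to their previous values. The abstract does not confirm either assumption. All names (`make_document`, `mix_samples`, `diagonal_fisher`, `lam`) are hypothetical, and the third method (changing feature dimensionality) is not shown.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def make_document(rng, shift, n=100):
    """Hypothetical toy 'document': 2-D features for two character classes."""
    y = rng.integers(0, 2, size=n)
    centers = np.array([[-2.0, 0.0], [2.0, 0.0]]) + shift
    X = centers[y] + 0.5 * rng.standard_normal((n, 2))
    return X, y

def mix_samples(documents, rng):
    """First method (as read here): pool the samples of all documents and shuffle."""
    X = np.concatenate([x for x, _ in documents])
    y = np.concatenate([t for _, t in documents])
    order = rng.permutation(len(X))
    return X[order], y[order]

def train(W, X, y, steps=200, lr=0.1, anchor=None, fisher=None, lam=0.0):
    """Gradient descent on mean cross-entropy, optionally with the penalty
    (lam / 2) * sum(fisher * (W - anchor) ** 2)."""
    onehot = np.eye(W.shape[1])[y]
    for _ in range(steps):
        grad = X.T @ (softmax(X @ W) - onehot) / len(X)
        if anchor is not None:
            grad = grad + lam * fisher * (W - anchor)
        W = W - lr * grad  # creates a new array; the caller's W is not modified
    return W

def diagonal_fisher(W, X, y):
    """Empirical diagonal Fisher information: the mean over samples of the
    squared gradient of log p(y | x) with respect to each weight."""
    err = np.eye(W.shape[1])[y] - softmax(X @ W)  # d log p / d logits
    return (X ** 2).T @ (err ** 2) / len(X)

def accuracy(W, X, y):
    return float(np.mean(np.argmax(X @ W, axis=1) == y))

rng = np.random.default_rng(0)
doc_a = make_document(rng, shift=np.array([0.0, 0.5]))
doc_b = make_document(rng, shift=np.array([0.0, -0.5]))
W0 = np.zeros((2, 2))

# First method: one training run on the mixed samples of both documents.
X_mix, y_mix = mix_samples([doc_a, doc_b], rng)
W_mixed = train(W0, X_mix, y_mix)

# Second method: train on document A, then continue on document B.
W_a = train(W0, *doc_a)
W_seq = train(W_a, *doc_b)

# Fourth method (assumed EWC-style): penalise moving weights that the
# Fisher information marks as important for document A.
F_a = diagonal_fisher(W_a, *doc_a)
W_fisher = train(W_a, *doc_b, anchor=W_a, fisher=F_a, lam=1.0)
```

Calling `accuracy(W_fisher, *doc_a)` and `accuracy(W_seq, *doc_a)` shows how much of the first document's performance each variant retains after training on the second. On this toy data the two documents are nearly identical, so any difference is likely to be small.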