FORM DOCUMENT REGISTRATION METHOD AND PROGRAM


Bibliographic details
Main authors: SHINJO HIROSHI, TSUTSUMI YASUTAKA
Format: Patent
Language: eng; jpn
Description
Summary:
PROBLEM TO BE SOLVED: To provide an OCR technique capable of identifying a form document even when there are many types of target form documents, including non-fixed form documents.
SOLUTION: A form document registration method includes, as steps executed by a computer: a step (S1) in which an image of a target form document is input; an image feature processing step (S3) in which an image feature is calculated from the image of the target form document and, from among the already registered form documents, a similar form document whose image feature is similar to that of the target form document is selected; a step (S4) in which character strings are extracted from the image of the target form document by optical character recognition; a character feature processing step (S5) in which the character strings to be registered as the character feature of the target form document are selected by comparing the character strings extracted from the image of the target form document with the character strings that the similar form document holds as its character feature; and a registration step (S7) in which the image feature and the character feature of the target form document are registered in a dictionary.
SELECTED DRAWING: Figure 2
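
The flow of steps S1, S3, S4, S5 and S7 can be illustrated with a minimal Python sketch. The abstract does not specify the image feature, the similarity measure, or the rule for selecting character strings, so everything concrete here is an assumption made for illustration: the grid-mean feature, cosine similarity with a 0.9 threshold, the rule "keep the strings that the similar form does not already hold", and all names (FormEntry, image_feature, select_similar, select_char_feature, register_form). OCR is passed in as a caller-supplied function rather than tied to any particular engine.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical image representation: rows of grayscale pixel values.
Image = Sequence[Sequence[float]]


@dataclass
class FormEntry:
    """One dictionary record: the two features the abstract says are registered."""
    image_feature: List[float]
    char_feature: List[str]


def image_feature(img: Image, grid: int = 2) -> List[float]:
    """Assumed image feature: mean intensity of each cell of a grid x grid layout."""
    h, w = len(img), len(img[0])
    feats: List[float] = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * h // grid, (gy + 1) * h // grid)
            cols = range(gx * w // grid, (gx + 1) * w // grid)
            cell = [img[y][x] for y in rows for x in cols]
            # An empty cell (image smaller than the grid) contributes 0.0.
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure between two image features."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def select_similar(dictionary: Dict[str, FormEntry], feat: Sequence[float],
                   threshold: float = 0.9) -> Optional[str]:
    """S3 (second half): pick the most similar registered form, if any is close enough."""
    best_id, best_score = None, threshold
    for form_id, entry in dictionary.items():
        score = cosine(feat, entry.image_feature)
        if score >= best_score:
            best_id, best_score = form_id, score
    return best_id


def select_char_feature(extracted: Sequence[str],
                        similar: Optional[FormEntry]) -> List[str]:
    """S5 under an assumed rule: keep strings the similar form does not already hold."""
    if similar is None:
        return list(extracted)
    known = set(similar.char_feature)
    return [s for s in extracted if s not in known]


def register_form(form_id: str, img: Image,
                  ocr: Callable[[Image], List[str]],
                  dictionary: Dict[str, FormEntry]) -> FormEntry:
    """Run S1 (input), S3, S4, S5 and S7 for one target form document."""
    feat = image_feature(img)                          # S3: compute image feature
    similar_id = select_similar(dictionary, feat)      # S3: select similar form
    strings = ocr(img)                                 # S4: extract character strings
    similar = dictionary[similar_id] if similar_id is not None else None
    chars = select_char_feature(strings, similar)      # S5: choose strings to register
    entry = FormEntry(image_feature=feat, char_feature=chars)
    dictionary[form_id] = entry                        # S7: register both features
    return entry
```

In this sketch, registering a first form into an empty dictionary stores all of its OCR strings, while a second form with a near-identical layout stores only the strings that set it apart from the first. That is one plausible reading of "selected based on comparison", not a rule stated in the abstract.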