The line segmentation algorithm of Indonesian electronic identity card (e-KTP) for data digitization
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Full text |
Summary: | The Indonesian Electronic Identity Card (e-KTP) is a source of information on its owner's identity and is widely used for administrative purposes. The biodata segment of the e-KTP consists of multiple lines, each differing in length and width, which poses a problem when digitizing data using Optical Character Recognition (OCR). A line segmentation algorithm must therefore be applied. This research proposes a line segmentation algorithm using rectangular cropping and Tesseract OCR. First, the algorithm crops the owner biodata and the line indicators. There are three line indicators, located below the 'alamat' (address), 'tempat/tanggal lahir' (place/date of birth), and 'nama' (name) areas. Then, OCR reads all of the cropped areas. If a line indicator's value is blank, the corresponding segment is known to have two lines. The OCR result is converted into an array separated by lines. The algorithm was exercised under four conditions: e-KTP with two lines of address; two lines of place and date of birth; two lines of name; and one line for every segment. The applied algorithm reached a success rate of 85% on 30 samples. Failures in line segmentation were caused by a non-optimal threshold value. |
---|---|
ISSN: | 0094-243X 1551-7616 |
DOI: | 10.1063/5.0000670 |
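
The steps in the summary (crop the biodata and the three line indicators, run OCR on each crop, treat a blank indicator as a two-line segment, then split the OCR text into an array of lines) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the crop coordinates, the `field_order` list, the function names, and the pluggable `ocr` callable are all assumptions made for the example, and the record does not describe the thresholding step, so it is omitted.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
Image = List[List[int]]           # grayscale image as rows of pixel values
Ocr = Callable[[Image], str]      # hypothetical OCR hook: image in, text out


def crop(image: Image, box: Box) -> Image:
    """Rectangular crop: keep rows top..bottom and columns left..right."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def find_two_line_fields(image: Image, indicator_boxes: Dict[str, Box],
                         ocr: Ocr) -> List[str]:
    """A field is assumed to span two lines when its indicator area reads blank."""
    return [field for field, box in indicator_boxes.items()
            if not ocr(crop(image, box)).strip()]


def segment_biodata(image: Image, biodata_box: Box,
                    indicator_boxes: Dict[str, Box],
                    field_order: List[str], ocr: Ocr) -> Dict[str, str]:
    """Split the biodata OCR text into lines and assign them to fields."""
    text = ocr(crop(image, biodata_box))
    # Convert the OCR result into an array of non-empty lines.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    two_line = set(find_two_line_fields(image, indicator_boxes, ocr))

    result: Dict[str, str] = {}
    i = 0
    for field in field_order:          # assumed top-to-bottom field order
        span = 2 if field in two_line else 1
        result[field] = " ".join(lines[i:i + span])
        i += span
    return result
```

In practice the `ocr` argument could wrap a real engine, for example `pytesseract.image_to_string` applied to a Pillow image cropped with `Image.crop`, in place of the list-based `crop` used here. The crop rectangles would need to be tuned to the card layout and scan resolution.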