Reassembling Shredded Document Stripes Using Word-Path Metric and Greedy Composition Optimal Matching Solver
This paper develops a shredded document reassembly algorithm based on character/word detection. A new word compatibility estimation metric and a searching strategy called Greedy Composition and Optimal Matching (GCOM) are proposed to compose documents from their vertically shredded stripes. We reduc...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on multimedia 2020-05, Vol.22 (5), p.1168-1181 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper develops a shredded document reassembly algorithm based on character/word detection. A new word compatibility estimation metric and a searching strategy called Greedy Composition and Optimal Matching (GCOM) are proposed to compose documents from their vertically shredded stripes. We reduce the stripe puzzle reassembly problem to the traveling salesman problem (TSP) on a sparse graph. The word-path compatibility metric takes advantages of the optical character recognition (OCR) to compute the compatibility score among a group of stripes. The global composition strategy, based on an integration of greedy composition and optimal matching, is proposed to search for a maximal Hamiltonian path and the final global reassembly. We demonstrate that our solver outperforms the state-of-the-art puzzle solvers on reassembling stripe shredded documents. |
---|---|
ISSN: | 1520-9210 1941-0077 |
DOI: | 10.1109/TMM.2019.2941777 |