Reassembling Shredded Document Stripes Using Word-Path Metric and Greedy Composition Optimal Matching Solver

Bibliographic details
Published in: IEEE Transactions on Multimedia, 2020-05, Vol. 22 (5), p. 1168-1181
Main authors: Liang, Yongqing; Li, Xin
Format: Article
Language: English
Description
Summary: This paper develops a shredded document reassembly algorithm based on character/word detection. A new word compatibility estimation metric and a search strategy called Greedy Composition and Optimal Matching (GCOM) are proposed to compose documents from their vertically shredded stripes. We reduce the stripe puzzle reassembly problem to the traveling salesman problem (TSP) on a sparse graph. The word-path compatibility metric takes advantage of optical character recognition (OCR) to compute the compatibility score among a group of stripes. The global composition strategy, which integrates greedy composition and optimal matching, searches for a maximal Hamiltonian path and the final global reassembly. We demonstrate that our solver outperforms state-of-the-art puzzle solvers on reassembling stripe-shredded documents.
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2019.2941777
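
The summary frames reassembly as a search for a maximal Hamiltonian path over a sparse compatibility graph. The sketch below illustrates only that framing with a generic greedy edge-selection heuristic. It is not the paper's GCOM solver, and it does not compute the word-path metric. The function name `greedy_stripe_order` and the `scores` dictionary of pairwise compatibilities are hypothetical, introduced here for illustration.

```python
def greedy_stripe_order(scores, n):
    """Greedy edge selection for a high-weight Hamiltonian path.

    scores: dict mapping (left, right) stripe index pairs to a compatibility
            score. Missing pairs are treated as incompatible (sparse graph).
            In practice the scores would come from a compatibility metric;
            here they are assumed inputs.
    n:      number of stripes, indexed 0..n-1.
    Returns a list of stripe indices in left-to-right order.
    """
    right_of = {}            # stripe -> chosen right neighbor
    has_left = set()         # stripes that already have a left neighbor
    parent = list(range(n))  # union-find forest, used to reject cycles

    def find(x):
        # Find the fragment representative, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Consider candidate adjacencies from most to least compatible.
    for (a, b), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        if a == b or a in right_of or b in has_left:
            continue         # each stripe gets at most one neighbor per side
        ra, rb = find(a), find(b)
        if ra == rb:
            continue         # joining would close a cycle
        right_of[a] = b
        has_left.add(b)
        parent[ra] = rb

    # Walk each fragment from its leftmost stripe and concatenate fragments.
    order = []
    for start in range(n):
        if start in has_left:
            continue
        s = start
        while s is not None:
            order.append(s)
            s = right_of.get(s)
    return order


# Made-up scores for four stripes; higher means "b fits to the right of a".
example_scores = {(2, 0): 0.9, (0, 3): 0.8, (3, 1): 0.7, (0, 1): 0.6, (1, 2): 0.1}
print(greedy_stripe_order(example_scores, 4))  # [2, 0, 3, 1]
```

A purely greedy pass like this can lock in an early wrong adjacency. The summary indicates that the authors pair greedy composition with an optimal matching step, which this sketch does not attempt to reproduce.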