REMOVAL OF SENSITIVE DATA FROM DOCUMENTS FOR USE AS TRAINING SETS

Bibliographic details
Main authors: DE BERKER, Archy Otto, MARCOTTE, Étienne, GUAY, Philippe, TOURILLON, Dominique
Format: Patent
Language: eng ; fre ; ger
Description
Summary: Systems and methods relating to the replacement or removal of sensitive data in images of documents. An initial image of a document with sensitive data is received at an execution module and changes are made based on the execution module's training. The changes include replacing or effectively removing the sensitive data from the image of the document. The resulting sanitized image is then sent to a user for validation of the changes. The feedback from the user is then used in training the execution module to refine its behaviour when applying changes to other initial images of documents. To train the execution module, training data sets of document images with sensitive data manually tagged by users are used. The execution module thus learns to identify sensitive data and its submodules replace that sensitive data with suitable replacement data. The feedback from the user works to improve the resulting sanitized images from the execution module.
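
The flow described in the summary (tag sensitive regions, sanitize the image, collect user corrections for further training) might be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the names `Region`, `sanitize`, and `collect_feedback` are hypothetical, the image is modelled as a plain grid of grayscale values, the learned detector is replaced by hand-written regions, and sensitive data is simply masked with a fill value rather than swapped for realistic replacement data.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical image representation: rows of grayscale pixel values (0-255).
Image = List[List[int]]


@dataclass(frozen=True)
class Region:
    """Axis-aligned box marking sensitive data; bottom/right are exclusive."""
    top: int
    left: int
    bottom: int
    right: int


def sanitize(image: Image, regions: List[Region], fill: int = 0) -> Image:
    """Return a copy of `image` with every tagged region overwritten by `fill`."""
    out = [row[:] for row in image]  # copy so the initial image is untouched
    for r in regions:
        # Clamp each region to the image bounds before masking.
        for y in range(max(r.top, 0), min(r.bottom, len(out))):
            row = out[y]
            for x in range(max(r.left, 0), min(r.right, len(row))):
                row[x] = fill
    return out


def collect_feedback(proposed: List[Region],
                     missed: List[Region],
                     wrong: List[Region]) -> List[Region]:
    """Build corrected labels from user validation.

    Keeps proposals the user did not reject and adds regions the user
    flagged as missed; the result could serve as a new training example.
    """
    return [r for r in proposed if r not in wrong] + list(missed)


# A blank 4x6 "document" with one region proposed as sensitive
# (stand-in for the output of a trained detector).
image = [[255] * 6 for _ in range(4)]
proposed = [Region(top=1, left=1, bottom=3, right=4)]
sanitized = sanitize(image, proposed)

# The user accepts the proposal and flags one extra region that was missed.
corrected = collect_feedback(proposed, missed=[Region(0, 5, 1, 6)], wrong=[])
```

In this sketch the corrected region list plays the role of the user feedback mentioned in the summary; how such feedback is actually fed back into training is not specified in this record.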