Document marking techniques using semantically similar phrases for document source detection

Bibliographic details
Main authors: Batterberry, Troy; Beck, Parker; Krishnan, Sanjay; Bianamara, Stephen; Huthwaite, Clayton; Saunders, Colin; Voss, Chad; Wong, David
Format: Patent
Language: English
Description
Summary: Unique copies of an original document can be generated and provided to individual recipients, so that a leaked copy can be traced back to its source. Each unique copy is generated by replacing terms in the original document with alternative terms. A first machine learning model receives a term from the document and outputs candidate alternative terms. These candidates are passed to a second machine learning model, which indicates a tone for each one. The tone of each candidate is compared with the tone of the original term, and one or more candidates are selected based on that comparison. The alternative terms used to generate the unique copies therefore have the same or a similar tone as the term they replace.
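
The pipeline in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the two machine learning models are replaced by hypothetical lookup tables (`SYNONYMS`, `TONE`), and the function names, the numeric tone scale, the `max_tone_gap` threshold, and the scheme for assigning variants to recipients are all assumptions made for illustration.

```python
from math import prod
from typing import Dict, List

# Hypothetical stand-in for the first model: term -> candidate alternatives.
SYNONYMS: Dict[str, List[str]] = {
    "big": ["large", "huge", "sizable"],
    "quick": ["fast", "rapid", "hasty"],
}

# Hypothetical stand-in for the second model: term -> tone score (assumed scale).
TONE: Dict[str, float] = {
    "big": 0.0, "large": 0.0, "huge": 0.6, "sizable": 0.1,
    "quick": 0.2, "fast": 0.2, "rapid": 0.3, "hasty": -0.5,
}


def suggest_alternatives(term: str) -> List[str]:
    """Stub for the first model: return candidate replacements for a term."""
    return SYNONYMS.get(term, [])


def tone_score(term: str) -> float:
    """Stub for the second model: return a tone score for a term."""
    return TONE.get(term, 0.0)


def select_alternatives(term: str, max_tone_gap: float = 0.15) -> List[str]:
    """Keep only candidates whose tone is close to the original term's tone."""
    base = tone_score(term)
    return [
        alt for alt in suggest_alternatives(term)
        if abs(tone_score(alt) - base) <= max_tone_gap
    ]


def make_unique_copies(text: str, recipients: List[str]) -> Dict[str, str]:
    """Give each recipient a distinct combination of tone-matched replacements."""
    words = text.split()
    # Each slot is a word position plus its options (original term first).
    slots = []
    for i, word in enumerate(words):
        options = [word] + select_alternatives(word)
        if len(options) > 1:
            slots.append((i, options))
    capacity = prod(len(options) for _, options in slots)
    if len(recipients) > capacity:
        raise ValueError("not enough replaceable terms for this many recipients")
    copies = {}
    for n, recipient in enumerate(recipients):
        out = list(words)
        k = n  # read the recipient index as a mixed-radix number over the slots
        for i, options in slots:
            out[i] = options[k % len(options)]
            k //= len(options)
        copies[recipient] = " ".join(out)
    return copies


def identify_source(leaked: str, copies: Dict[str, str]) -> List[str]:
    """Return the recipients whose copy matches the leaked text exactly."""
    return [r for r, text in copies.items() if text == leaked]
```

For example, `make_unique_copies("a big and quick fix", ["alice", "bob", "carol"])` yields three different texts, and passing any one of them to `identify_source` returns the single matching recipient. Exact string matching is the simplest possible lookup; a leak that has been edited or excerpted would need a more tolerant comparison, which this sketch does not attempt.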