IDENTIFYING PERSONALLY IDENTIFIABLE INFORMATION WITHIN AN UNSTRUCTURED DATA STORE

Bibliographic details
Main authors: Huang, Dachuan, Sankuratripati, Subhash, Pihur, Vasyl, Fortier, Leah
Format: Patent
Language: English
Description
Summary: Methods and systems for identifying personally identifiable information (PII) are disclosed. In some aspects, first frequency maps are generated for fields known to store PII. Each frequency map may count occurrences of unique bigrams in the corresponding PII field. A field of interest may then be analyzed to generate a second frequency map. Correlations between the first frequency maps and the second frequency map may be generated. If one of the correlations meets a given criterion, the disclosed embodiments may determine that the field of interest does or does not include PII. Access control for the field of interest may then be based on whether the field includes PII. In some aspects, a storage location of data included in the field of interest may be based on whether the field includes PII.
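
The comparison described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The summary names neither the correlation measure nor the criterion, so the Pearson correlation, the 0.5 threshold, and the function names (bigram_frequency_map, correlation, likely_contains_pii) are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def bigram_frequency_map(values):
    """Count occurrences of each character bigram across a field's values."""
    counts = Counter()
    for value in values:
        text = value.lower()
        for i in range(len(text) - 1):
            counts[text[i:i + 2]] += 1
    return counts


def correlation(map_a, map_b):
    """Pearson correlation of two frequency maps over the union of their bigrams.

    Assumption: the source does not specify a correlation measure.
    """
    keys = sorted(set(map_a) | set(map_b))
    if len(keys) < 2:
        return 0.0
    xs = [map_a.get(k, 0) for k in keys]
    ys = [map_b.get(k, 0) for k in keys]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def likely_contains_pii(known_pii_fields, field_values, threshold=0.5):
    """Return True if the field correlates with any known PII field.

    known_pii_fields maps a field name to its list of known PII values.
    The threshold is a hypothetical criterion, not one taken from the source.
    """
    candidate = bigram_frequency_map(field_values)
    return any(
        correlation(bigram_frequency_map(values), candidate) >= threshold
        for values in known_pii_fields.values()
    )
```

A caller might pass the result of likely_contains_pii to whatever access-control or storage-placement logic applies to the field of interest. How that decision is enforced is outside the scope of this sketch.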