DETECTING PERSONALLY IDENTIFIABLE INFORMATION IN DATA ASSOCIATED WITH A CLOUD COMPUTING SYSTEM

Bibliographic Details
Main authors: CHINTALAPATI, Sekhar Poornananda, YEOLE, Gaurav Anil, PEICU, Mihai Silviu, RAJPURE, Dattatraya Baban, YELAHANKA SRINIVAS, Vinod Kumar, BROUWER, Pieter Kristian
Format: Patent
Language: English
Description
Summary: Methods and systems for detecting personally identifiable information in data associated with a cloud computing system are described. An example method includes ingesting the data associated with the cloud computing system to generate source data. The method includes processing the source data by: performing cell-based de-duplication to generate cell-based de-duplicated data, subjecting the cell-based de-duplicated data to regular expression classification to generate a first subset of initial results, tokenizing the cell-based de-duplicated data to generate tokenized data, and de-duplicating the tokenized data and subjecting the de-duplicated tokenized data to a first named entity recognition classification to generate a second subset of the initial results. The method includes cross-referencing the cell-based de-duplicated data and the initial results and subjecting the output of the cross-referencing to a second named entity recognition classification to generate final results. The method includes processing the final results to detect any personally identifiable information in the final results.
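
The order of the steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The regular expressions, the tokenizer, the name list standing in for a trained named entity recognition model, the context-word rule used for the second pass, and all function names are hypothetical placeholders chosen only to make the data flow concrete.

```python
import re
from collections import defaultdict

# Hypothetical regex patterns; the source does not list the ones it uses.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Assumption: a tiny name list stands in for a trained NER model.
KNOWN_NAMES = {"alice", "bob"}


def dedupe(items):
    """Return the distinct items in first-seen order."""
    return list(dict.fromkeys(items))


def regex_classify(cells):
    """First subset of initial results: (cell, label) pairs from regex matches."""
    return [(c, label) for c in cells for label, p in PATTERNS.items() if p.search(c)]


def tokenize(cells):
    """Split cells into alphabetic tokens (placeholder tokenizer)."""
    return [t for c in cells for t in re.findall(r"[A-Za-z]+", c)]


def first_ner(tokens):
    """Second subset of initial results: (token, label) pairs from the toy NER."""
    return [(t, "PERSON") for t in tokens if t.lower() in KNOWN_NAMES]


def cross_reference(cells, initial_results):
    """Map each cell to the labels of initial results whose text occurs in it."""
    hits = defaultdict(set)
    for text, label in initial_results:
        for c in cells:
            if text in c:
                hits[c].add(label)
    return dict(hits)


def second_ner(cross_referenced):
    """Toy second pass: keep PERSON only when the cell contains a context word."""
    final = {}
    for cell, labels in cross_referenced.items():
        kept = set(labels)
        if "PERSON" in kept and not re.search(r"\b(name|user|contact)\b", cell, re.I):
            kept.discard("PERSON")
        if kept:
            final[cell] = kept
    return final


def detect_pii(raw_cells):
    """Run the steps in the order given in the summary."""
    cells = dedupe(raw_cells)                            # cell-based de-duplication
    initial = regex_classify(cells)                      # first subset
    initial += first_ner(dedupe(tokenize(cells)))        # second subset
    return second_ner(cross_reference(cells, initial))   # final results


if __name__ == "__main__":
    sample = [
        "contact: alice@example.com",
        "contact: alice@example.com",  # duplicate cell, classified only once
        "user Bob logged in",
        "bob the builder toy",
        "disk usage 42%",
    ]
    for cell, labels in detect_pii(sample).items():
        print(cell, sorted(labels))
```

In this sketch, de-duplicating first means each distinct cell and each distinct token is classified only once. The cross-referencing step then maps token-level hits back to whole cells, so the second pass can use the surrounding text when deciding which labels to keep.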