CUSTODIAN DISAMBIGUATION AND DATA MATCHING

Provided is a technique for matching different user representations of a person in a plurality of computer systems may be provided. The technique includes collecting information sets about user representations from a plurality of computer systems; normalizing the information sets to a unified format...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Hampp-Bahnmueller Thomas A.P, Petrenko Pavlo, Schmid Sebastian B, Bremer Lars, Lorch Markus
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Provided is a technique for matching different user representations of a person in a plurality of computer systems may be provided. The technique includes collecting information sets about user representations from a plurality of computer systems; normalizing the information sets to a unified format; grouping the information sets in the unified format into indexing buckets based on a user name using a non-phonetic algorithm; determining a similarity score for each pair of information sets in each of the indexing buckets; classifying each information set pair into a set of classes based on the similarity scores, wherein the set of classes comprise at least matches and non-matches; and using a data structure for merging information of information set pairs classified as matches.