ENTITY MATCHING WITH JOINT LEARNING OF BLOCKING AND MATCHING
Format: | Patent |
---|---|
Language: | eng |
Summary: | A method of identifying entities from different data sources as matching entity pairs that refer to the same real-world object is provided. A set of labelling functions is provided to determine matching entities and non-matching entities of a source data set and at least one target data set. A subset of labelling functions is selected from the provided set for training machine learning models for two modules: a blocking module that aims at filtering out as many unmatched entity pairs as possible without missing any true matches, and a matching module that aims at predicting matching results for the remaining entity pairs not filtered out by the blocking module. A blocking model for the blocking module and a matching model for the matching module are jointly learned based on available unlabeled entity pairs and the labelling functions of the selected subset. |
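The pipeline in the summary (labelling functions, selection of a subset, a blocking step, then a matching step) can be illustrated with a small Python sketch. Nothing below comes from the patent itself: the record fields, the three labelling functions, the coverage-based selection rule, the `slack` factor, and both similarity features are hypothetical. In particular, the joint learning of two models is replaced here by a much simpler stand-in that fits two thresholds in one routine from the same weak labels.

```python
from itertools import product

MATCH, NON_MATCH, ABSTAIN = 1, -1, 0

def name_similarity(s, t):
    """Jaccard overlap of lower-cased name tokens (cheap blocking feature)."""
    a, b = set(s["name"].lower().split()), set(t["name"].lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def match_score(s, t):
    """Richer matching feature: name similarity plus city agreement."""
    return 0.5 * name_similarity(s, t) + 0.5 * (s["city"] == t["city"])

# Hypothetical labelling functions: each votes MATCH, NON_MATCH or ABSTAIN.
def lf_name_overlap(s, t):
    return MATCH if name_similarity(s, t) >= 0.6 else ABSTAIN

def lf_different_city(s, t):
    return NON_MATCH if s["city"] != t["city"] else ABSTAIN

def lf_same_phone(s, t):
    phone = s.get("phone")
    return MATCH if phone and phone == t.get("phone") else ABSTAIN

def select_lfs(lfs, pairs, min_coverage=0.1):
    """Assumed selection rule: keep functions that vote on enough pairs."""
    def coverage(lf):
        return sum(lf(s, t) != ABSTAIN for s, t in pairs) / len(pairs)
    return [lf for lf in lfs if coverage(lf) >= min_coverage]

def weak_label(lfs, s, t):
    """Combine the votes of the selected functions by their signed sum."""
    total = sum(lf(s, t) for lf in lfs)
    return MATCH if total > 0 else NON_MATCH if total < 0 else ABSTAIN

def fit_jointly(pairs, lfs, slack=0.5):
    """Stand-in for joint learning: fit both thresholds from one set of weak labels."""
    labels = [weak_label(lfs, s, t) for s, t in pairs]
    pos = [p for p, y in zip(pairs, labels) if y == MATCH]
    if not pos:
        raise ValueError("no weakly labelled matches to learn from")
    # Blocker: loose enough to keep every weakly labelled match.
    block_t = slack * min(name_similarity(s, t) for s, t in pos)
    # Matcher: separate matches from the non-matches that survive blocking.
    neg = [p for p, y in zip(pairs, labels)
           if y == NON_MATCH and name_similarity(*p) >= block_t]
    lo = min(match_score(s, t) for s, t in pos)
    hi = max((match_score(s, t) for s, t in neg), default=lo)
    return block_t, (lo + hi) / 2

def predict(pairs, block_t, match_t):
    candidates = [p for p in pairs if name_similarity(*p) >= block_t]  # blocking
    return [(s["name"], t["name"]) for s, t in candidates
            if match_score(s, t) >= match_t]                           # matching

# Toy source and target data sets (made up for illustration).
source = [{"name": "Acme Corp", "city": "Berlin"},
          {"name": "Globex Inc", "city": "Paris"}]
target = [{"name": "ACME Corp Ltd", "city": "Berlin"},
          {"name": "Acme Corp Holdings Group", "city": "Madrid"},
          {"name": "Globex Inc", "city": "Paris"}]

pairs = list(product(source, target))  # unlabeled entity pairs
selected = select_lfs([lf_name_overlap, lf_different_city, lf_same_phone], pairs)
block_t, match_t = fit_jointly(pairs, selected)
matches = predict(pairs, block_t, match_t)
print(matches)
```

In this toy run `lf_same_phone` never fires, so the selection step drops it. The blocker then discards the pairs with no shared name tokens, and the matcher rejects the one remaining pair whose cities differ. A real system would learn parametric models rather than two thresholds, so the sketch only mirrors the data flow.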