Active Learning for Matching Heterogeneous Entity Representations with Language Models
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | A system, computer program product, and method are provided for active learning (AL) for matching heterogeneous entity representations. The task in entity resolution (ER) is to find pairs from datasets that correspond to the same entity. A labeled training dataset is used to train a first artificial intelligence (AI) model, and this training employs a pre-trained language model. A second AI model is trained with the language model as updated by the first AI model, and the second AI model creates a candidate set of likely duplicate pairs. A subset is selectively identified from the candidate set, and the labeled training dataset is augmented with that subset. |
---|
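
The loop described in the summary (train a matcher on labeled pairs, generate a candidate set of likely duplicates, select a subset, add it to the labeled data) can be sketched roughly as follows. This is only an illustration under loud assumptions, not the patented method: a token-overlap score stands in for the pre-trained language model, a single learned threshold stands in for the first AI model, a fixed score filter stands in for the second AI model, and the subset is chosen by closeness to the threshold. All function names, the `oracle` callback, and the parameters `min_score` and `k` are hypothetical.

```python
from typing import Callable

Pair = tuple[str, str]
Labeled = list[tuple[Pair, bool]]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a toy stand-in for a language-model scorer."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fit_threshold(labeled: Labeled) -> float:
    """Assumed 'first model': a cut-off midway between the lowest-scoring
    labeled match and the highest-scoring labeled non-match."""
    pos = [jaccard(*p) for p, is_match in labeled if is_match]
    neg = [jaccard(*p) for p, is_match in labeled if not is_match]
    if not pos or not neg:
        return 0.5  # arbitrary fallback when one class is missing
    return (min(pos) + max(neg)) / 2


def candidate_pairs(left: list[str], right: list[str],
                    min_score: float = 0.2) -> list[Pair]:
    """Assumed 'second model': keep only pairs that look like possible duplicates."""
    return [(l, r) for l in left for r in right if jaccard(l, r) >= min_score]


def select_uncertain(cands: list[Pair], threshold: float, k: int) -> list[Pair]:
    """Assumed selection rule: the k candidates scoring closest to the threshold."""
    return sorted(cands, key=lambda p: abs(jaccard(*p) - threshold))[:k]


def active_learning_round(labeled: Labeled, left: list[str], right: list[str],
                          oracle: Callable[[Pair], bool], k: int = 2) -> Labeled:
    """One round: fit, generate candidates, select a subset, augment the labels."""
    threshold = fit_threshold(labeled)
    seen = {p for p, _ in labeled}
    cands = [p for p in candidate_pairs(left, right) if p not in seen]
    chosen = select_uncertain(cands, threshold, k)
    return labeled + [(p, oracle(p)) for p in chosen]
```

In this sketch, `oracle` plays the role of whoever supplies labels for the selected pairs; calling `active_learning_round` repeatedly, and feeding each result back in as `labeled`, gives the iterative loop.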