Self-supervised semantic shift detection and alignment
Generate, for each of the words of a common vocabulary of first and second text corpora, a first word embedding vector in the first text corpus and a second word embedding vector in the second text corpus. Generate, for each word in a random sample of non-landmark words, an artificially shifted word...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Generate, for each of the words of a common vocabulary of first and second text corpora, a first word embedding vector in the first text corpus and a second word embedding vector in the second text corpus. Generate, for each word in a random sample of non-landmark words, an artificially shifted word embedding vector by modifying the first word embedding vector for that word. Train a machine learning classifier to predict whether an artificial shift has been injected for a given word, based on the artificially shifted word embedding vector and the second word embedding vector for the given word. Predict semantic shifts for at least a plurality of the words of the common vocabulary by providing the first word embedding vectors and the second word embedding vectors for at least the plurality of the words of the common vocabulary as input to the trained machine learning classifier. |
---|