Cross-lingual Word Sense Disambiguation using mBERT Embeddings with Syntactic Dependencies
Format: Article
Language: English
Abstract: Cross-lingual word sense disambiguation (WSD) tackles the challenge of
disambiguating ambiguous words across languages given their context. The
pre-trained BERT embedding model has been proven effective at extracting the
contextual information of words, and has been incorporated as a feature into
many state-of-the-art WSD systems. To investigate how syntactic information can
be added to the BERT embeddings to yield word embeddings that incorporate both
semantics and syntax, this project proposes concatenated embeddings, produced
by generating dependency parse trees and encoding the relative relationships of
words into the input embeddings. Two methods are also proposed to reduce the
size of the concatenated embeddings. The experimental results show that the
high dimensionality of the syntax-incorporated embeddings constitutes an
obstacle for the classification task, which needs to be addressed in future
studies.
DOI: 10.48550/arxiv.2012.05300
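
A minimal sketch of the concatenation idea described in the abstract, assuming mBERT served through the Hugging Face transformers library and spaCy as the dependency parser; the one-hot encoding of each word's dependency relation and the subword-to-word mean pooling are illustrative assumptions, not the paper's exact scheme:

import numpy as np
import spacy
import torch
from transformers import AutoModel, AutoTokenizer

nlp = spacy.load("en_core_web_sm")  # dependency parser (assumed model)
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
DEP_LABELS = list(nlp.get_pipe("parser").labels)  # fixed set of dependency relations

def syntax_augmented_embeddings(sentence: str) -> np.ndarray:
    """Return one vector per word: the mBERT embedding concatenated with a
    one-hot encoding of the word's dependency relation to its head."""
    doc = nlp(sentence)
    words = [tok.text for tok in doc]

    # Contextual mBERT embeddings, mean-pooled from subwords back to words.
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (n_subwords, 768)
    word_ids = enc.word_ids()  # maps each subword to its word (None for [CLS]/[SEP])
    sem = np.vstack([
        hidden[[j for j, w in enumerate(word_ids) if w == i]].mean(dim=0).numpy()
        for i in range(len(words))
    ])

    # One-hot syntactic features taken from the dependency parse.
    syn = np.zeros((len(doc), len(DEP_LABELS)))
    for i, tok in enumerate(doc):
        if tok.dep_ in DEP_LABELS:
            syn[i, DEP_LABELS.index(tok.dep_)] = 1.0

    # Concatenation yields the semantics- and syntax-incorporated embeddings.
    return np.hstack([sem, syn])

The abstract does not name the two size-reduction methods; a standard option such as PCA applied to the concatenated vectors would be one way to bring the dimensionality back down before the classification step.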