Cross-lingual Information Retrieval with BERT
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Multiple neural language models, e.g., BERT and XLNet, have been developed recently and have achieved impressive results in various NLP tasks, including sentence classification, question answering, and document ranking. In this paper, we explore the use of the popular bidirectional language model BERT to model and learn the relevance between English queries and foreign-language documents in the task of cross-lingual information retrieval (CLIR). A deep relevance matching model based on BERT is introduced and trained by fine-tuning a pretrained multilingual BERT model with weak supervision, using homemade CLIR training data derived from parallel corpora. Experimental results on the retrieval of Lithuanian documents against short English queries show that our model is effective and outperforms competitive baseline approaches. |
DOI: | 10.48550/arxiv.2004.13005 |
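The summary mentions deriving weakly supervised CLIR training data from parallel corpora. A minimal sketch of one plausible construction, assuming each aligned sentence pair yields a positive (English side as query, foreign side as relevant document) and randomly sampled foreign sentences serve as negatives; the function name and sampling scheme are illustrative assumptions, not the paper's exact recipe:

```python
import random

def make_clir_examples(parallel_pairs, num_negatives=1, seed=0):
    """Build weakly supervised CLIR training triples from a parallel corpus.

    Each (english, foreign) sentence pair yields one positive example
    (English side as query, aligned foreign side as relevant document),
    plus foreign sentences sampled from other pairs as negatives.
    The sampling scheme here is an assumption for illustration.
    """
    rng = random.Random(seed)
    foreign_pool = [foreign for _, foreign in parallel_pairs]
    examples = []
    for i, (en, foreign) in enumerate(parallel_pairs):
        examples.append((en, foreign, 1))  # relevant pair
        for _ in range(num_negatives):
            j = rng.randrange(len(foreign_pool))
            while j == i:  # avoid sampling the aligned sentence itself
                j = rng.randrange(len(foreign_pool))
            examples.append((en, foreign_pool[j], 0))  # non-relevant pair

    return examples

# Toy English-Lithuanian parallel corpus (illustrative sentences only).
pairs = [
    ("the cat sleeps", "katė miega"),
    ("the dog runs", "šuo bėga"),
    ("birds sing", "paukščiai gieda"),
]
data = make_clir_examples(pairs, num_negatives=1)
```

Triples of this form (query, document, relevance label) could then be fed to a BERT-based cross-encoder for fine-tuning.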