CROSS-LINGUAL UNSUPERVISED CLASSIFICATION WITH MULTI-VIEW TRANSFER LEARNING

Bibliographic details
Main authors: Fei, Hongliang; Li, Ping
Format: Patent
Language: eng; fre; ger
Description
Summary: Presented herein are embodiments of an unsupervised cross-lingual sentiment classification model, which may be referred to as a multi-view encoder-classifier (MVEC), that leverages an unsupervised machine translation (UMT) system and a language discriminator. Unlike previous language model (LM)-based fine-tuning approaches, which adjust parameters based solely on the classification error on training data, embodiments employ the encoder-decoder framework of a UMT system as a regularization component on the shared network parameters. In one or more embodiments, the cross-lingual encoder learns a shared representation that is effective both for reconstructing input sentences in two languages and for generating more representative views of the input for classification. Experiments on five language pairs verify that an MVEC embodiment significantly outperforms other models on 8 of 11 sentiment classification tasks.