Systems And Methods For Training Translation Models Using Source-Augmented Training Examples

Systems and methods for training a translation model based on a first text sequence in a first language, a second text sequence in a second language different from the first language, and a label based on a source of the second text sequence. In some examples, the label may comprise an Internet doma...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Ratnakar, Viresh, Krikun, Maxim, Huang, Jing, Johnson, Melvin, Shah, Apurva
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Systems and methods for training a translation model based on a first text sequence in a first language, a second text sequence in a second language different from the first language, and a label based on a source of the second text sequence. In some examples, the label may comprise an Internet domain, an Internet subdomain, a uniform resource locator, a website name, or an IP address. In some examples, the label may further indicate a source of the first text sequence. In some examples, each given training example may be automatically generated by sampling the first text sequence from a first page of a given Internet domain, sampling the second text sequence from a second page of the given Internet domain, and generating the label based on all or a portion of source data of the second page.