Named entity recognition on chat data
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving a plurality of word strings in a first language, each received word string comprising a plurality of words, identifying one or more named entities in each received word string using a sta...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving a plurality of word strings in a first language, each received word string comprising a plurality of words, identifying one or more named entities in each received word string using a statistical classifier that was trained using training data comprising a plurality of features, wherein one of the features is a word shape feature that comprises a respective token for each letter of a respective word wherein each token signifies a case of the letter or whether the letter is a digit, and translating the received word strings from the first language to a second language including preserving the respective identified named entities in each received word string during translation. |
---|