Neologism classification techniques with trigrams and longest common subsequences
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Techniques are provided for identifying attributes associated with a neologism or an unknown word or name, so that real-world characteristics can be predicted for it. Trigrams are identified for an input word, and word embedding model vector values are calculated for those trigrams and entered into a matrix. Trigrams are also identified for nearest names. Classification values are calculated from the trigrams of the input word and the trigrams of the nearest names, and these values are entered into the matrix. A convolutional neural network can then process the matrix to identify one or more characteristics associated with the neologism. |
---|
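
The feature-building steps named in the title and summary can be illustrated with a minimal Python sketch. It is not the patented method: the record does not say how the classification values are computed, so the trigram-overlap score, the normalization of the longest common subsequence (LCS) length, and the function names `char_trigrams`, `lcs_length`, `trigram_overlap`, and `feature_rows` are all assumptions made for illustration. The word embedding lookup and the convolutional neural network are left out.

```python
from typing import List


def char_trigrams(word: str) -> List[str]:
    """Return the overlapping character trigrams of a lowercased word."""
    w = word.lower()
    return [w[i:i + 3] for i in range(len(w) - 2)]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def trigram_overlap(word: str, name: str) -> float:
    """Assumed score: share of the word's trigrams that also occur in name."""
    grams = char_trigrams(word)
    if not grams:
        return 0.0
    name_grams = set(char_trigrams(name))
    return sum(g in name_grams for g in grams) / len(grams)


def feature_rows(word: str, nearest_names: List[str]) -> List[List[float]]:
    """One matrix row per nearest name: [trigram overlap, normalized LCS]."""
    rows = []
    for name in nearest_names:
        lcs = lcs_length(word.lower(), name.lower())
        longest = max(len(word), len(name), 1)  # guard against empty strings
        rows.append([trigram_overlap(word, name), lcs / longest])
    return rows
```

Calling `feature_rows("brandex", ["brandon"])` yields one row holding the two similarity values for that pair. In a fuller pipeline, rows like these would sit in the matrix next to the trigram embedding vectors before the matrix is passed to the network.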