Word Embedding Dimension Reduction via Weakly-Supervised Feature Selection
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | As a fundamental task in natural language processing, word embedding converts
each word into a representation in a vector space. A challenge with word
embedding is that as the vocabulary grows, the vector space's dimension
increases, which can lead to a vast model size. Storing and processing word
vectors is resource-demanding, especially for mobile edge-device
applications. This paper explores word embedding dimension reduction. To
balance computational costs and performance, we propose an efficient and
effective weakly-supervised feature selection method named WordFS. It has two
variants, each utilizing novel criteria for feature selection. Experiments on
various tasks (e.g., word and sentence similarity and binary and multi-class
classification) indicate that the proposed WordFS model outperforms other
dimension reduction methods at lower computational costs. We have released the
code for reproducibility along with the paper. |
---|---|
DOI: | 10.48550/arxiv.2407.12342 |
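
The summary describes dimension reduction by feature selection, that is, keeping a subset of the original embedding dimensions rather than projecting onto new axes. The following is a minimal sketch of that general idea only, not the WordFS method: the record does not state the paper's selection criteria, so the scoring rule used here (correlation between each dimension's contribution to cosine similarity and a small set of word-pair similarity scores) is an assumption made for illustration. The function names `select_dimensions` and `reduce_embeddings` are likewise hypothetical.

```python
import numpy as np


def select_dimensions(emb, pairs, scores, k):
    """Pick k embedding dimensions using word-pair similarity scores.

    emb    : (V, d) array of word vectors (rows must be non-zero)
    pairs  : (n, 2) integer array of row indices into emb
    scores : (n,) array of similarity labels for the pairs (weak supervision)
    k      : number of dimensions to keep

    ASSUMPTION: the criterion below is illustrative, not the paper's.
    """
    # Unit-normalise rows so per-dimension products sum to cosine similarity.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    prod = unit[pairs[:, 0]] * unit[pairs[:, 1]]  # shape (n, d)

    # Pearson correlation of each dimension's contribution with the labels.
    prod_c = prod - prod.mean(axis=0)
    scores_c = scores - scores.mean()
    denom = np.linalg.norm(prod_c, axis=0) * np.linalg.norm(scores_c)
    corr = (prod_c.T @ scores_c) / np.where(denom == 0, 1.0, denom)

    # Keep the k highest-scoring dimensions, returned in ascending index order.
    return np.sort(np.argsort(-corr)[:k])


def reduce_embeddings(emb, keep):
    """Return the embedding matrix restricted to the selected dimensions."""
    return emb[:, keep]
```

Because the selected columns are copied unchanged, the reduced vectors need no projection matrix at inference time; only the list of kept indices has to be stored alongside the smaller table.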