An Eclectic Approach for Enhancing Language Models Through Rich Embedding Features
Published in: | IEEE Access, 2024, Vol. 12, pp. 100921-100938 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Text processing is a fundamental aspect of Natural Language Processing (NLP) and is crucial for applications across artificial intelligence, data science, and information retrieval; it also plays a core role in language models. Most text-processing approaches describe and synthesize, to a greater or lesser degree, the lexical, syntactic, and semantic properties of text as numerical vectors that induce a metric space in which underlying patterns and structures related to the original text can be found. Since each approach has strengths and weaknesses, it is hard to find a single approach that perfectly extracts representative text properties for every task and application domain. This paper proposes a novel approach capable of synthesizing information from heterogeneous state-of-the-art text-processing approaches into a unified representation. Encouraging results demonstrate that using this representation in popular machine-learning tasks not only leads to superior performance but also offers notable advantages in memory efficiency and in preserving the underlying information of the distinct sources involved in the representation. |
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2024.3422971 |
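
The abstract describes fusing heterogeneous text representations into a single unified vector space. As a rough illustration only (the paper's actual fusion method, feature sources, and dimensionality-reduction choices are not reproduced here), the Python sketch below combines two simple views of a toy corpus: word-level and character-level TF-IDF, both hypothetical stand-ins for the embedding sources used in the article. Each view is normalized, the views are concatenated into one representation, and the result is compressed.

```python
# Illustrative sketch only: this is NOT the method proposed in the article.
# It shows the general idea of merging two heterogeneous text representations
# into one vector space and then compressing the combined representation.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import normalize
from sklearn.decomposition import TruncatedSVD

corpus = [
    "Text processing is a core part of language models.",
    "Embeddings map text into a metric vector space.",
    "Heterogeneous representations can be fused into one.",
]

# View 1: word-level lexical features.
word_view = TfidfVectorizer(analyzer="word").fit_transform(corpus)

# View 2: character n-gram features (a stand-in for any second embedding source).
char_view = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5)).fit_transform(corpus)

# L2-normalize each view so neither dominates, then concatenate them column-wise.
unified = hstack([normalize(word_view), normalize(char_view)])

# Optionally compress the unified representation to trade dimensionality for memory.
compressed = TruncatedSVD(n_components=2, random_state=0).fit_transform(unified)
print(compressed.shape)  # (3, 2)
```

Normalizing each view before concatenation keeps sources with very different scales or dimensionalities from dominating the combined space; the final reduction step stands in, loosely, for the memory-efficiency aspect highlighted in the abstract.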