Understanding LLMs: A comprehensive overview from training to inference

Bibliographic details
Published in: Neurocomputing (Amsterdam), 2025-03, Vol. 620, p. 129190, Article 129190
Authors: Liu, Yiheng; He, Hao; Han, Tianle; Zhang, Xu; Liu, Mengyuan; Tian, Jiaming; Zhang, Yutong; Wang, Jiaqi; Gao, Xiaohui; Zhong, Tianyang; Pan, Yi; Xu, Shaochen; Wu, Zihao; Liu, Zhengliang; Zhang, Xin; Zhang, Shu; Hu, Xintao; Zhang, Tuo; Qiang, Ning; Liu, Tianming; Ge, Bao
Format: Article
Language: English
Abstract: The introduction of ChatGPT has led to a significant increase in the use of Large Language Models (LLMs) for addressing downstream tasks, and with it a growing focus on cost-efficient training and deployment. Low-cost training and deployment of LLMs represent the future development trend. This paper reviews the evolution of LLM training techniques and inference deployment technologies in line with this emerging trend, with the objective of providing researchers a guide for integrating LLMs into their work. The discussion of training covers data preprocessing, training architecture, pre-training tasks, parallel training, and model fine-tuning. On the inference side, the paper covers model compression, parallel computation, memory scheduling, and structural optimization. It also examines how LLMs are used and offers insights into their future development.
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2024.129190