MOSS: An Open Conversational Large Language Model

Bibliographic details
Published in:	International journal of automation and computing 2024-10, Vol.21 (5), p.888-905
Main authors:	Sun, Tianxiang; Zhang, Xiaotian; He, Zhengfu; Li, Peng; Cheng, Qinyuan; Liu, Xiangyang; Yan, Hang; Shao, Yunfan; Tang, Qiong; Zhang, Shiduo; Zhao, Xingjian; Chen, Ke; Zheng, Yining; Zhou, Zhejian; Li, Ruixiao; Zhan, Jun; Zhou, Yunhua; Li, Linyang; Yang, Xiaogui; Wu, Lingling; Yin, Zhangyue; Huang, Xuanjing; Jiang, Yu-Gang; Qiu, Xipeng
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Chatbots; Computer Science; Effectiveness; Large language models; Research Article
Online access:	Full text

Description
Abstract:	Conversational large language models (LLMs) such as ChatGPT and GPT-4 have recently exhibited remarkable capabilities across various domains, capturing widespread public attention. To facilitate this line of research, we report the development of MOSS, an open-source conversational LLM that contains 16 billion parameters and can follow a variety of instructions in multi-turn interactions with humans. The base model of MOSS is pre-trained on large-scale unlabeled English, Chinese, and code data. To optimize the model for dialogue, we generate 1.1 million synthetic conversations based on user prompts collected through earlier versions of our model API. We then perform preference-aware training on preference data annotated from AI feedback. Evaluation results on real-world use cases and academic benchmarks demonstrate the effectiveness of the proposed approaches. In addition, we present an effective practice for augmenting MOSS with several external tools. Through the development of MOSS, we have established a complete technical roadmap for large language models, from pre-training through supervised fine-tuning to alignment, verifying the feasibility of building ChatGPT-like models under resource-limited conditions and providing a reference for both the academic and industrial communities. Model weights and code are publicly available at https://github.com/OpenMOSS/MOSS.
ISSN:	2731-538X 2153-182X 1476-8186 2731-5398 2153-1838 1751-8520
DOI:	10.1007/s11633-024-1502-8