YAYI 2: Multilingual Open-Source Large Language Models

As the latest advancements in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and even have been regarded as a potential path to the artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and gained comparable performances to proprietary models. However, these models are primarily designed for English scenarios and exhibit poor performances in Chinese contexts. In this technical report, we propose YAYI 2, including both base and chat models, with 30 billion parameters. YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline. The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback. Extensive experiments on multiple benchmarks, such as MMLU and CMMLU, consistently demonstrate that the proposed YAYI 2 outperforms other similar sized open-source models.
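The abstract describes an openly released 30B-parameter base and chat model. As a purely illustrative sketch (not part of this record), the snippet below shows how such a checkpoint might be loaded and prompted with the Hugging Face transformers library; the repository id, device settings, and sampling parameters are assumptions, so consult the official YAYI 2 release for the actual model name and license terms.

# Illustrative only: loading an open-source 30B causal language model such as
# YAYI 2 with Hugging Face transformers. The repository id below is an
# assumption made for this sketch and is not confirmed by this record.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "wenge-research/yayi2-30b"  # assumed Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # shard the 30B parameters across available GPUs
    torch_dtype="auto",      # keep the checkpoint's native precision
    trust_remote_code=True,  # allow custom modeling code, if the release ships any
)

prompt = "The capital of China is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Benchmark results such as the MMLU and CMMLU scores mentioned in the abstract are typically reproduced with a dedicated evaluation harness rather than ad-hoc generation like this.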

Detailed description

Saved in:
Bibliographic details
Main authors: Luo, Yin; Kong, Qingchao; Xu, Nan; Cao, Jia; Hao, Bao; Qu, Baoyu; Chen, Bo; Zhu, Chao; Zhao, Chenyang; Zhang, Donglei; Feng, Fan; Zhao, Feifei; Sun, Hailong; Yang, Hanxuan; Pan, Haojun; Liu, Hongyu; Guo, Jianbin; Du, Jiangtao; Wang, Jingyi; Li, Junfeng; Sun, Lei; Liu, Liduo; Dong, Lifeng; Liu, Lili; Wang, Lin; Zhang, Liwen; Wang, Minzheng; Wang, Pin; Yu, Ping; Li, Qingxiao; Yan, Rui; Zou, Rui; Li, Ruiqun; Huang, Taiwen; Wang, Xiaodong; Wu, Xiaofei; Peng, Xin; Zhang, Xina; Fang, Xing; Xiao, Xinglin; Hao, Yanni; Dong, Yao; Wang, Yigang; Liu, Ying; Jiang, Yongyu; Wang, Yungan; Wang, Yuqi; Wang, Zhangsheng; Yu, Zhaoxin; Luo, Zhen; Mao, Wenji; Wang, Lei; Zeng, Dajun
Format: Article
Language: eng
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Online access: Order full text
creator Luo, Yin
Kong, Qingchao
Xu, Nan
Cao, Jia
Hao, Bao
Qu, Baoyu
Chen, Bo
Zhu, Chao
Zhao, Chenyang
Zhang, Donglei
Feng, Fan
Zhao, Feifei
Sun, Hailong
Yang, Hanxuan
Pan, Haojun
Liu, Hongyu
Guo, Jianbin
Du, Jiangtao
Wang, Jingyi
Li, Junfeng
Sun, Lei
Liu, Liduo
Dong, Lifeng
Liu, Lili
Wang, Lin
Zhang, Liwen
Wang, Minzheng
Wang, Pin
Yu, Ping
Li, Qingxiao
Yan, Rui
Zou, Rui
Li, Ruiqun
Huang, Taiwen
Wang, Xiaodong
Wu, Xiaofei
Peng, Xin
Zhang, Xina
Fang, Xing
Xiao, Xinglin
Hao, Yanni
Dong, Yao
Wang, Yigang
Liu, Ying
Jiang, Yongyu
Wang, Yungan
Wang, Yuqi
Wang, Zhangsheng
Yu, Zhaoxin
Luo, Zhen
Mao, Wenji
Wang, Lei
Zeng, Dajun
description As the latest advancements in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and even have been regarded as a potential path to the artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and gained comparable performances to proprietary models. However, these models are primarily designed for English scenarios and exhibit poor performances in Chinese contexts. In this technical report, we propose YAYI 2, including both base and chat models, with 30 billion parameters. YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline. The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback. Extensive experiments on multiple benchmarks, such as MMLU and CMMLU, consistently demonstrate that the proposed YAYI 2 outperforms other similar sized open-source models.
doi_str_mv 10.48550/arxiv.2312.14862
format Article
creationdate 2023-12-22
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
link https://arxiv.org/abs/2312.14862
identifier DOI: 10.48550/arxiv.2312.14862
language eng
recordid cdi_arxiv_primary_2312_14862
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computation and Language
title YAYI 2: Multilingual Open-Source Large Language Models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T10%3A27%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=YAYI%202:%20Multilingual%20Open-Source%20Large%20Language%20Models&rft.au=Luo,%20Yin&rft.date=2023-12-22&rft_id=info:doi/10.48550/arxiv.2312.14862&rft_dat=%3Carxiv_GOX%3E2312_14862%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true