Yi-Lightning Technical Report
Saved in:
Main authors: Wake, Alan; Chen, Bei; Lv, C. X; Li, Chao; Huang, Chengen; Cai, Chenglin; Zheng, Chujie; Cooper, Daniel; Zhou, Fan; Hu, Feng; Zhang, Ge; Wang, Guoyin; Ji, Heng; Qiu, Howard; Zhu, Jiangcheng; Tian, Jun; Su, Katherine; Zhang, Lihuan; Li, Liying; Song, Ming; Li, Mou; Liu, Peng; Hu, Qicheng; Wang, Shawn; Zhou, Shijun; Yang, Shiming; Li, Shiyong; Zhu, Tianhang; Xie, Wen; Huang, Wenhao; He, Xiang; Chen, Xiaobo; Hu, Xiaohui; Ren, Xiaoyi; Niu, Xinyao; Li, Yanpeng; Zhao, Yongke; Luo, Yongzhen; Xu, Yuchi; Sha, Yuxuan; Yan, Zhaodong; Liu, Zhiyuan; Zhang, Zirui; Dai, Zonghong
Format: Article
Language: English
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning
Online access: Order full text
creator | Wake, Alan; Chen, Bei; Lv, C. X; Li, Chao; Huang, Chengen; Cai, Chenglin; Zheng, Chujie; Cooper, Daniel; Zhou, Fan; Hu, Feng; Zhang, Ge; Wang, Guoyin; Ji, Heng; Qiu, Howard; Zhu, Jiangcheng; Tian, Jun; Su, Katherine; Zhang, Lihuan; Li, Liying; Song, Ming; Li, Mou; Liu, Peng; Hu, Qicheng; Wang, Shawn; Zhou, Shijun; Yang, Shiming; Li, Shiyong; Zhu, Tianhang; Xie, Wen; Huang, Wenhao; He, Xiang; Chen, Xiaobo; Hu, Xiaohui; Ren, Xiaoyi; Niu, Xinyao; Li, Yanpeng; Zhao, Yongke; Luo, Yongzhen; Xu, Yuchi; Sha, Yuxuan; Yan, Zhaodong; Liu, Zhiyuan; Zhang, Zirui; Dai, Zonghong |
description | This technical report presents Yi-Lightning, our latest flagship large
language model (LLM). It achieves exceptional performance, ranking 6th overall
on Chatbot Arena, with particularly strong results (2nd to 4th place) in
specialized categories including Chinese, Math, Coding, and Hard Prompts.
Yi-Lightning leverages an enhanced Mixture-of-Experts (MoE) architecture,
featuring advanced expert segmentation and routing mechanisms coupled with
optimized KV-caching techniques. Our development process encompasses
comprehensive pre-training, supervised fine-tuning (SFT), and reinforcement
learning from human feedback (RLHF), where we devise deliberate strategies for
multi-stage training, synthetic data construction, and reward modeling.
Furthermore, we implement RAISE (Responsible AI Safety Engine), a
four-component framework to address safety issues across pre-training,
post-training, and serving phases. Empowered by our scalable super-computing
infrastructure, all these innovations substantially reduce training, deployment,
and inference costs while maintaining high-performance standards. With further
evaluations on public academic benchmarks, Yi-Lightning demonstrates
competitive performance against top-tier LLMs, while we observe a notable
disparity between traditional, static benchmark results and real-world, dynamic
human preferences. This observation prompts a critical reassessment of
conventional benchmarks' utility in guiding the development of more intelligent
and powerful AI systems for practical applications. Yi-Lightning is now
available through our developer platform at https://platform.lingyiwanwu.com. |
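As background for the Mixture-of-Experts routing the abstract mentions, the core gating step can be sketched as follows. This is a generic top-k gating sketch for illustration only, not Yi-Lightning's actual implementation; the expert count, hidden size, and k=2 are assumptions, and the report's own expert segmentation and routing mechanisms are not public in this record.

```python
import numpy as np

def topk_moe_route(x, gate_w, k=2):
    """Route one token to its top-k experts by gate score.

    x: (d,) token hidden state; gate_w: (n_experts, d) gating weights.
    Returns the chosen expert indices and their normalized mixing weights.
    """
    logits = gate_w @ x                      # one scalar score per expert
    top = np.argsort(logits)[-k:][::-1]      # indices of the k highest-scoring experts
    scores = np.exp(logits[top] - logits[top].max())  # numerically stable softmax
    weights = scores / scores.sum()          # mixing weights over selected experts
    return top, weights

# Toy example: 4 experts, 8-dim hidden state, route to the top 2.
rng = np.random.default_rng(0)
idx, w = topk_moe_route(rng.normal(size=8), rng.normal(size=(4, 8)), k=2)
```

In a full MoE layer, only the selected experts' feed-forward networks run for this token, and their outputs are combined using these weights; that sparsity is what lets MoE models cut inference cost relative to dense models of similar capacity.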
doi_str_mv | 10.48550/arxiv.2412.01253 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2412.01253 |
language | eng |
recordid | cdi_arxiv_primary_2412_01253 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning |
title | Yi-Lightning Technical Report |