HMoE: Heterogeneous Mixture of Experts for Language Modeling

Mixture of Experts (MoE) offers remarkable performance and computational efficiency by selectively activating subsets of model parameters. Traditionally, MoE models use homogeneous experts, each with identical capacity. However, varying complexity in input data necessitates experts with diverse capabilities …
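To make the heterogeneous-expert idea concrete, the sketch below shows a minimal MoE feed-forward layer in PyTorch in which each expert has its own hidden width and a learned router activates the top-k experts per token. This is an illustrative sketch under assumed settings (the class name HeterogeneousMoE, the expert widths, and top_k=2 are not taken from the paper), not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HeterogeneousMoE(nn.Module):
    """Sketch of an MoE layer whose experts differ in capacity (hidden width)."""

    def __init__(self, d_model, expert_hidden_sizes, top_k=2):
        super().__init__()
        # Each expert is a feed-forward block; capacity varies via its hidden size.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, h), nn.GELU(), nn.Linear(h, d_model))
             for h in expert_hidden_sizes]
        )
        self.router = nn.Linear(d_model, len(expert_hidden_sizes))
        self.top_k = top_k

    def forward(self, x):
        # x: (num_tokens, d_model); route each token to its top-k experts.
        gate_logits = self.router(x)                          # (tokens, num_experts)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                  # renormalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                  # tokens sent to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Example: four experts of increasing capacity, two activated per token.
layer = HeterogeneousMoE(d_model=64, expert_hidden_sizes=[64, 128, 256, 512], top_k=2)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])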

Bibliographic Details
Published in: arXiv.org, 2024-08
Main authors: Wang, An; Sun, Xingwu; Xie, Ruobing; Li, Shuaipeng; Zhu, Jiaqi; Yang, Zhen; Zhao, Pinxue; Han, J N; Kang, Zhanhui; Wang, Di; Okazaki, Naoaki; Cheng-zhong, Xu
Format: Article
Language: English
Online access: Full text