Xinglin Pan, Wenxiang Lin

Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules

Jun 30, 2024
Via arXiv