Zihan Qiu

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

Dec 12, 2024

Post-hoc Reward Calibration: A Case Study on Length Bias

Sep 25, 2024

Layerwise Recurrent Router for Mixture-of-Experts

Aug 13, 2024

Reconstructing Global Daily CO2 Emissions via Machine Learning

Jul 29, 2024

A Closer Look into Mixture-of-Experts in Large Language Models

Jun 26, 2024

Unlocking Continual Learning Abilities in Language Models

Jun 25, 2024

GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory

Jun 18, 2024

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

May 24, 2024

A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias

Apr 01, 2024

HyperMoE: Paying Attention to Unselected Experts in Mixture of Experts via Dynamic Transfer

Feb 25, 2024