Zili Wang

Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining

Mar 06, 2025
Via arXiv

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Feb 20, 2025
Via arXiv

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Feb 18, 2025
Via arXiv

Multi-matrix Factorization Attention

Dec 26, 2024
Via arXiv

Continuous Speculative Decoding for Autoregressive Image Generation

Nov 18, 2024
Via arXiv

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Nov 07, 2024
Via arXiv

BoxMap: Efficient Structural Mapping and Navigation

Oct 08, 2024
Via arXiv

Post-hoc Reward Calibration: A Case Study on Length Bias

Sep 25, 2024
Via arXiv

Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Sep 10, 2024
Via arXiv

Layerwise Recurrent Router for Mixture-of-Experts

Aug 13, 2024
Via arXiv