Weizhu Chen

LLMs Can Generate a Better Answer by Aggregating Their Own Responses

Mar 06, 2025
Via arXiv

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025
Via arXiv

LongRoPE2: Near-Lossless LLM Context Window Scaling

Feb 27, 2025
Via arXiv

COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs

Feb 26, 2025
Via arXiv

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Jan 06, 2025
Via arXiv

StreamAdapter: Efficient Test Time Adaptation from Contextual Streams

Nov 14, 2024
Via arXiv

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Oct 15, 2024
Via arXiv

GRIN: GRadient-INformed MoE

Sep 18, 2024
Via arXiv

Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena

Jul 15, 2024
Via arXiv

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Jun 11, 2024
Via arXiv