Tong Zhu

LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Nov 24, 2024

NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models

Oct 15, 2024

CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling

Sep 28, 2024

ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM

Aug 22, 2024

Learning to Refuse: Towards Mitigating Privacy Risks in LLMs

Jul 14, 2024

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Jun 24, 2024

Timo: Towards Better Temporal Reasoning for Language Models

Jun 20, 2024

Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Jun 17, 2024

Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?

Jun 13, 2024

Probing Language Models for Pre-training Data Detection

Jun 03, 2024