Zhiyuan Zeng

Exploring the Benefit of Activation Sparsity in Pre-training

Oct 04, 2024
Via arXiv

Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

May 21, 2024
Via arXiv

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset

Mar 12, 2024
Via arXiv

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE

Feb 21, 2024
Via arXiv

Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora

Jan 26, 2024
Via arXiv

Evaluating Large Language Models at Evaluating Instruction Following

Oct 11, 2023
Via arXiv

Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Oct 10, 2023
Via arXiv

Plug-and-Play Knowledge Injection for Pre-trained Language Models

May 28, 2023
Via arXiv

Emergent Modularity in Pre-trained Transformers

May 28, 2023
Via arXiv

KNIFE: Knowledge Distillation with Free-Text Rationales

Dec 19, 2022
Via arXiv