Zangwei Zheng

Dataset Growth

May 28, 2024
Via arXiv

How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning?

Apr 19, 2024
Via arXiv

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers

Mar 15, 2024
Via arXiv

Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization

Feb 23, 2024
Via arXiv

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Jan 29, 2024
Via arXiv

CAME: Confidence-guided Adaptive Memory Efficient Optimization

Jul 05, 2023
Via arXiv

To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis

May 22, 2023
Via arXiv

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline

May 22, 2023
Via arXiv

Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models

Mar 12, 2023
Via arXiv

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning

Mar 08, 2023
Via arXiv