Zhiqiang Shen

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

Jan 16, 2025

Dataset Distillation via Committee Voting

Jan 13, 2025

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Nov 29, 2024

Bi-Mamba: Towards Accurate 1-Bit State Space Models

Nov 18, 2024

Crystal: Illuminating LLM Abilities on Language and Code

Nov 06, 2024

LFME: A Simple Framework for Learning from Multiple Experts in Domain Generalization

Oct 22, 2024

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Oct 17, 2024

CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment

Oct 16, 2024

Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need

Aug 28, 2024

Adaptive Mix for Semi-Supervised Medical Image Segmentation

Jul 31, 2024