
Li Lyna Zhang

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Aug 12, 2024

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Feb 21, 2024

Boosting LLM Reasoning: Push the Limits of Few-shot Learning with Reinforced In-Context Pruning

Dec 26, 2023

Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models

Oct 11, 2023

Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference

Jun 26, 2023

Accurate and Structured Pruning for Efficient Automatic Speech Recognition

May 31, 2023

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

Mar 21, 2023

SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference

Mar 15, 2023