Yuhui Xu

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

Oct 07, 2024

ThinK: Thinner Key Cache by Query-Driven Pruning

Jul 30, 2024

One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments

May 30, 2024

SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models

May 25, 2024

TerDiT: Ternary Diffusion Models with Transformers

May 23, 2024

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models

Feb 22, 2024

QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

Sep 26, 2023

Batch Normalization with Enhanced Linear Transformation

Nov 28, 2020

Weight-Sharing Neural Architecture Search: A Battle to Shrink the Optimization Gap

Aug 05, 2020

TRP: Trained Rank Pruning for Efficient Deep Neural Networks

Apr 30, 2020