Yuhui Xu

Separated Contrastive Learning for Matching in Cross-domain Recommendation with Curriculum Scheduling

Feb 22, 2025

Reward Models Identify Consistency, Not Causality

Feb 20, 2025

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Jan 31, 2025

GaLore+: Boosting Low-Rank Adaptation for LLMs with Cross-Head Projection

Dec 15, 2024

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

Oct 07, 2024

ThinK: Thinner Key Cache by Query-Driven Pruning

Jul 30, 2024

One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments

May 30, 2024

SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models

May 25, 2024

TerDiT: Ternary Diffusion Models with Transformers

May 23, 2024

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models

Feb 22, 2024