Jiawei Zhao

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading

Feb 18, 2025

ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization

Feb 04, 2025

Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition

Jan 04, 2025

S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity

Dec 10, 2024

Focus on BEV: Self-calibrated Cycle View Transformation for Monocular Birds-Eye-View Segmentation

Oct 21, 2024

Prefix Guidance: A Steering Wheel for Large Language Models to Defend Against Jailbreak Attacks

Aug 22, 2024

MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training

Jul 22, 2024

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

Jul 15, 2024

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Jul 11, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Mar 06, 2024
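
A minimal sketch of the gradient low-rank projection idea named in the GaLore title: the gradient is projected onto its top singular directions, the optimizer state is kept only in that low-rank space, and the update is projected back to the full weight shape. The rank, refresh period, and momentum-style update below are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch of GaLore-style gradient low-rank projection (NumPy).
# Assumptions: rank r, projector refresh period T, and a simple momentum
# update in the projected space; the paper pairs projection with Adam-style
# optimizers. Names (galore_step, state keys) are illustrative.
import numpy as np

def galore_step(W, grad, state, rank=4, period=200, lr=1e-3, beta=0.9):
    """One weight update with the gradient projected into a low-rank subspace."""
    step = state.setdefault("step", 0)
    # Refresh the projector P from the gradient's top singular vectors
    # every `period` steps (illustrative schedule).
    if step % period == 0 or "P" not in state:
        U, _, _ = np.linalg.svd(grad, full_matrices=False)
        state["P"] = U[:, :rank]                      # (m, r) projector
        state["m"] = np.zeros((rank, grad.shape[1]))  # optimizer state lives in low-rank space
    P = state["P"]
    g_low = P.T @ grad                                # project gradient: (r, n)
    state["m"] = beta * state["m"] + (1 - beta) * g_low
    update = P @ state["m"]                           # project update back to full shape
    state["step"] = step + 1
    return W - lr * update

# Usage: optimizer memory scales with r * n instead of m * n.
W = np.random.randn(64, 32)
state = {}
for _ in range(5):
    grad = np.random.randn(*W.shape)  # stand-in for a real gradient
    W = galore_step(W, grad, state)
```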