Zhihang Yuan

MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues

Dec 03, 2024

LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization

Nov 26, 2024

CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios

Sep 16, 2024

Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding

Jul 12, 2024

DiTFastAttn: Attention Compression for Diffusion Transformer Models

Jun 12, 2024

PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram

May 29, 2024

I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models

May 28, 2024

A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training

May 27, 2024

SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models

May 10, 2024

A Survey on Efficient Inference for Large Language Models

Apr 22, 2024