Ting Cao

Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management

Oct 29, 2024

Making Every Frame Matter: Continuous Video Understanding for Large Models via Adaptive State Modeling

Oct 19, 2024

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Oct 17, 2024

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024

Advancing Multi-Modal Sensing Through Expandable Modality Alignment

Jul 25, 2024

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024

BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation

Feb 16, 2024

Exploring the Impact of In-Browser Deep Learning Inference on Quality of User Experience and Performance

Feb 08, 2024

AFPQ: Asymmetric Floating Point Quantization for LLMs

Nov 03, 2023