Ting Cao

Microsoft Research

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs

Feb 17, 2025
Via arXiv

LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator

Jan 18, 2025
Via arXiv

LeMo: Enabling LEss Token Involvement for MOre Context Fine-tuning

Jan 15, 2025
Via arXiv

Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management

Oct 29, 2024
Via arXiv

Making Every Frame Matter: Continuous Video Understanding for Large Models via Adaptive State Modeling

Oct 19, 2024
Via arXiv

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Oct 17, 2024
Via arXiv

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024
Via arXiv

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024
Via arXiv

Advancing Multi-Modal Sensing Through Expandable Modality Alignment

Jul 25, 2024
Via arXiv

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024
Via arXiv