
Ting Cao

Microsoft Research

BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache

Mar 24, 2025

Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment

Mar 21, 2025

StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition

Mar 08, 2025

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs

Feb 17, 2025

LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator

Jan 18, 2025

LeMo: Enabling LEss Token Involvement for MOre Context Fine-tuning

Jan 15, 2025

Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management

Oct 29, 2024

Making Every Frame Matter: Continuous Video Understanding for Large Models via Adaptive State Modeling

Oct 19, 2024

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Oct 17, 2024

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024